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Preface 


The  Advanced  Research  and  Development  Activity  (ARDA)  within  the  U.S.  intelligence 
community  (IC)  has  several  research  “thrusts,”  including  one  on  advanced  Information 
Assurance  (IA)  headed  by  Richard  C.  Brackney.  On  March  2-4,  2004,  an  unclassified  work¬ 
shop  was  held  at  the  offices  of  McAfee  Security  (a  division  of  Network  Associates,  Inc.)  in 
Rockville,  MD.  The  topic  was  “Understanding  the  Insider  Threat.” 

The  format  of  the  workshop  combined  plenary  sessions  and  four  “breakout”  groups, 
whose  specialized  topics  were  the  following: 

•  Intelligence  Community  (IC)  System  Models 

•  Vulnerabilities  and  Exploits 

•  Attacker  Models 

•  Event  Characterization. 

The  workshop  brought  together  members  of  the  IC  with  specific  knowledge  of  IC 
document  management  systems  and  IC  business  practices;  persons  with  knowledge  of  insider 
attackers,  both  within  and  outside  the  IC;  and  researchers  involved  in  developing  technology 
to  counter  insider  threats. 

These  proceedings  contain  an  overview  of  the  findings  from  this  workshop  and  the 
display  charts  from  briefings  given  to  workshop  participants.  This  document  should  be  of 
interest  to  researchers  investigating  methods  for  countering  the  insider  threat  to  sensitive 
information  systems,  and  to  members  of  the  intelligence  community  concerned  with  the 
insider  threat  and  its  mitigation. 

The  RAND  Corporation’s  research  for  ARDA’s  IA  thrust  is  conducted  within  the 
Intelligence  Policy  Center  (IPC)  of  the  RAND  National  Security  Research  Division 
(NSRD).  RAND  NSRD  conducts  research  and  analysis  for  the  Office  of  the  Secretary  of 
Defense,  the  Joint  Staff,  the  Unified  Commands,  the  defense  agencies,  the  Department  of 
the  Navy,  the  U.S.  intelligence  community,  allied  foreign  governments,  and  foundations. 

For  more  information  on  the  Intelligence  Policy  Center,  contact  the  Acting  Director, 
Greg  Treverton.  He  can  be  reached  by  e-mail  at  Greg_Treverton@rand.org;  by  phone  at 
(310)  393-0411;  or  by  mail  at  RAND,  1776  Main  Street,  Santa  Monica,  CA,  90407-2138. 
More  information  about  RAND  is  available  at  www.rand.org. 
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Summary 


A  major  research  thrust  of  the  Advanced  Research  and  Development  Activity  (ARDA)  of  the 
U.S.  intelligence  community  (IC)  involves  information  assurance  (LA).  Perhaps  the  greatest 
threat  that  LA  activities  within  the  IC  must  address  is  the  “insider  threat” — malevolent  (or 
possibly  inadvertent)  actions  by  an  already  trusted  person  with  access  to  sensitive  information 
and  information  systems. 

This  unclassified  workshop,  held  March  2-4,  2004,  focused  on  the  insider  threat  and 
possible  indicators  and  warnings,  observables,  and  actions  to  mitigate  that  threat.  The  ARDA 
researchers  participating  gave  special  attention  to  the  activities,  processes,  and  systems  used 
within  the  intelligence  community. 

A  combination  of  plenary  and  breakout  sessions  discussed  various  aspects  of  the 
problem,  including  IC  system  models,  vulnerabilities  and  exploits,  attacker  models,  and 
characterization  of  events  associated  with  an  insider  attack.  A  set  of  presentations  by  mem¬ 
bers  of  the  IC  and  its  contractors  on  Intelink  (Appendix  G)  and  such  research  activities  as  the 
development  of  “Glass  Box”  software  (see  Appendix  H)  and  ARDA’s  “Novel  Intelligence 
from  Massive  Data”  (NIMD)  research  program  (Appendix  I)  aided  the  workshop  discus¬ 
sions.  The  present  workshop  built  upon  the  availability  of  materials  generated  in  an  earlier 
workshop  focused  on  the  insider  threat  (Appendix  F). 

Several  overall  themes  emerged  from  these  deliberations,  discussed  below  under  the 
headings  of  “Research  Questions  and  Challenges”  and  “Databases  Needed”  (by  researchers). 


Intelligence  Community  System  Models 

The  overall  intelligence  process  involves  requirements,  collection,  processing  and  exploita¬ 
tion,  analysis  and  production,  dissemination,  and  consumption,  with  feedback  loops  at  all 
steps,  as  shown  in  Figure  S.l. 

Variant  models,  such  as  the  NSA  Reference  Model  (NRM),  also  exist.  Of  key 
concern  to  this  group  of  researchers  was  the  question:  What  “observables”1  can  be  obtained 
at  all  stages  of  this  process  that  would  allow  comparison  of  normal  analyst  activity  with 
abnormal  activity — which  is  potentially,  but  not  necessarily,  malevolent?  Figure  S.2  provides 
an  indication  of  the  richness  of  the  concept  of  “observable”;  it  is  a  taxonomy  developed  by 
the  earlier  insider  threat  workshop  cited  above.  Similar  taxonomies  characterize  IC  “assets” 
and  users. 


1  An  observable  is  anything  that  can  be  detected  with  current  technology.  A  number  of  workshop  participants  argued  that 
this  definition  should  be  broadened  to  include  foreseeable  future  technological  developments. 
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Figure  S.1 

Intelligence  Process 
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Figure  S.2 

Taxonomy  of  Observables 
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Vulnerabilities  and  Exploits 

What  types  of  exploits2  might  an  insider  use  to  obtain  information,  alter  its  integrity,  or 
deny  its  availability  to  those  who  need  it?  This  workshop  concentrated  on  cyber- related 


2  The  noun  exploit  is  often  used  within  the  intelligence  community  to  mean  the  development  of  a  plan  (and,  usually,  its 
subsequent  execution — often  surreptitiously)  to  obtain  information  or  an  advantage. 
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exploits  because  they  were  felt  to  be  potentially  the  most  damaging  and  most  likely  to 
increase  in  the  future,  as  a  new  generation  of  analysts  emerges  with  more  computer  skills 
than  the  previous  generation. 

Workshop  participants  generated  a  list  of  33  example  exploits.  For  each  they  listed  a 
brief  description,  preconditions  that  would  allow  the  exploit  to  happen,  observables  that 
might  be  generated  during  the  exploit,  and  effects  of  the  exploit  (usually  one  of  the  follow¬ 
ing:  a  breach  of  confidentiality,  integrity,  or  availability,  or  an  enabler  of  other  exploits).  The 
short  titles  of  the  vulnerabilities  are  listed  in  Table  S.l.  Further  details  may  be  found  in 
Chapter  Three. 


Attacker  Models 

Figure  S.3  shows  an  overall  model  of  the  steps  involved  if  a  malevolent  insider  were  to 
“mount  an  attack”  against  an  IC  asset.  The  attack  might  be  as  simple  as  obtaining  access  to 
information  he  or  she  does  not  have  a  need  to  know  or  as  complex  as  disabling  a  key  intelli¬ 
gence  collection/processing/dissemination  system. 

Another  way  of  depicting  attacker  actions  is  shown  in  Figure  S.4.  Here  the  attacker 
steps — motivation,  benefit/risk  assessment,  acquiring  the  “client,”  collecting  payment — were 


Table  S.l 

Vulnerabilities  and  Exploits 


1.  Virus-laden  CD  and/or  USB  flash  drive  and/or  floppy 

2.  Administrator  lockout 

3.  Social  engineer  passwords 

4.  Retry  Internet  attacks 

5.  Smuggling  out  USB  flash  device  or  other  media 
(exfiltration) 

6.  "Missing"  laptops/hardware 

7.  Targeted  acquisition  of  surplus  equipment 

8.  Un  patched  systems 

9.  Sabotaged  patches 

10.  False  positives  on  anti-virus 

11.  Use  of  unattended  terminal 

12.  Targeting  database  "adjustments" 


18.  Mislabeled  paper 

19.  Netmeeting/WebEx  controls 

20.  "Day  zero"  attacks  based  on  source  code 
availability 

21.  Covert  channels  through  steganography3 

22.  Copy  and  paste  between  classifications  (from  high 
to  low) 

23.  Internal  e-mail  that  performs  attacks 

24.  Wireless  telephone  cameras  to  capture 
information 

25.  Telephone  tap  recording  onto  removable  media 

26.  Telephone  tap  via  hacking  PBX  telephone 
controller 

27.  Analyst  changes  workflow  to  exclude  other 
analysts  (dissemination) 

28.  Analyst  changes  workflow  to  include 
himself/herself 

29.  Insert  bad  content  into  report  upon  inception  (e.g. 
translation) 


13.  Install  software  on  host  computer  to  capture 
keystrokes  logger 

14.  Extra  copy  of  DB  backups 

15.  Wireless  transmissions 

16.  Cell  phone/PD A/voice  recorder  in  classified 
meeting 

17.  Suspicious  activity  on  real  systems  (e.g.,  searching 
own  name  in  databases) 


30.  Delete/withhold  content  into  report 
upon  inception 

31.  Redirect  analyst  resources  to  support 
adversary's  agenda 

32.  Poor  quality  analysis/results/ 
reports 

33.  Get  IC  asset  to  collect  info  that  benefits  an 
unauthorized  party 


aSteganography  is  the  hiding  of  information  by  embedding  in  an  innocuous  message  or  file,  such  as  a  digitized 
picture. 
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Figure  S.3 

Spiral  Model  Flowchart 


Figure  S.4 

Insider  Attack  Actions  (white  items  not  cyber  observable) 


Attack 

'  • 

■  ■■■  l 

■  \  - 

. 

Benefit/Risk  Assessment 

M 

Acquire 

Client 

0 

T 

Entrenchment 

..  •  ; 

l 

V 

A 

Access 

Exploitation 

iS:#Yw 

T 

1 

0 

N 

Recon 

Extraction 

Exfiltration 

Manipulation 

Communication 

Collect 

Payment 

Countering  Cl 

^  ^  1 

deemed  not  to  generate  cyber  observables  (that  is,  they  would  not  be  detected  by  information 
systems  now  in  use  or  with  enhancements  planned  by  researchers  and  developers). 

Given  the  various  steps  an  attacker  follows,  as  shown  in  Figure  S.4,  which  steps  are 
candidates  for  using  the  vulnerabilities  and  exploits  shown  in  Table  S.l?  The  answer  is 
shown  in  Figure  S.5,  where  the  unitalicized  insider  actions  have  parenthesized  numbers 
linking  them  to  numbered  entries  in  Table  S.L  The  parenthesized  suffix  letters  C,  I,  A,  E 
indicate  whether  the  actions  would  lead  to  a  breach  of  information  Confidentiality,  Integrity, 
Availability,  or  would  be  an  Enabler  of  other  attacks. 
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Figure  S.5 

Insider  Actions  Taxonomy  Cross-Referenced  with  Vulnerabilities  and  Exploits  (V&E)  List 
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Event  Characterization 

As  attacker  actions  generate  observables  through  the  operation  of  “detectors”  of  those  observ¬ 
ables,  indicators  of  possible  abnormal  activity  are  generated.  Those  indicators  can  form  a 
report;  multiple  reports  can  be  fused  into  an  “incident”;  and  multiple  incidents  then  fused 
into  a  “case”  of  one  or  more  incidents.3  That  process  is  shown  graphically  in  Figure  S.6. 


Research  Questions  and  Challenges 

Each  breakout  group  tried  to  formulate  a  set  of  research  questions  arising  from  its  delibera¬ 
tions.  Some  groups  stated  these  questions  in  the  form  of  “grand  challenges”  to  be  addressed. 
We  summarize  the  key  questions  and  challenges  below. 

Six  Categories  of  Research  Questions 

Research  issues  tended  to  fall  within  six  categories: 

1. User  roles 

2.  Actions 


3  We  assume  that  a  “case"  may  be  merely  a  collection  of  incidents  having  some  commonality  to  be  watched,  or  it  could  be 
the  result  of  a  post-facto  analysis  of  source,  cause,  damage,  etc. 
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Figure  S.6 

Data  Collection  Steps  Regarding  an  Event 


3.  Observables  (events) 

4.  Sensors 

5.  Fusion  and  analysis  (both  spatial  and  temporal) 

6.  “Triggers”  (priorities,  and  level  of  certainty). 

The  first  four  categories  each  require  languages  to  describe  them ,  and  means  for  map¬ 
ping  each  into  the  next  (i.e.,  from  a  description  of  user  roles  to  a  set  of  described  user  actions, 
which  in  turn  lead  to  a  set  of  potential  observables.  Those  observables  are  then  sensed  and 
the  sensed  signals  fed  into  fusion  and  analysis  programs,  which  in  turn  create  actions  and 
alerts  within  the  system). 

An  additional  common  thread  is  the  need  for  correlation  and  management  tools  to  cor¬ 
relate  multiple  events  or  triggers  with  an  incident,  to  correlate  multiple  events  with  a  case, 
and  to  correlate  multiple  cases  into  a  coordinated  attack. 

The  topic  of  sensors  (item  4  in  the  above  bulleted  list)  requires  substantial  research  in 
at  least  the  following  areas: 

•  Identification  of  information  that  should  go  into  an  event  record 

•  Development  of  sensors  specific  to  particular  applications 

•  Standardization  of  event  record  syntax  and  semantics;  scales  of  severity  and  confi¬ 
dence;  system  interfaces;  and  means  for  establishing  an  inviolate  “chain  of  evidence” 

•  Detection  of  “low  and  slow”  attacks 

•  Optimization  of  selection,  placement,  and  tuning  of  sensors 

•  Tradeoffs  in  adaptability:  How  do  you  recognize  legitimate  changes  in  user  behavior? 
How  do  you  resist  the  “conditioning”  of  sensors  by  a  malicious  insider  (through  a 
pattern  of  actions  that  “migrate”  the  sensor  from  a  nominal  setting  to  one  that  won't 
recognize  the  attack)? 

•  Development  of  validation  and  test  data  and  techniques  (see  “Databases  Needed,” 
below). 
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Challenges 

Participants  stated  several  “grand  challenges”  for  researchers: 

•  Define  an  effective  way  of  monitoring  what  people  do  with  their  cyber  access,  to  iden¬ 
tify  acts  of  cyber  espionage.  Focus  on  detection,  not  prevention.  Such  monitoring  (or 
the  perception  of  monitoring,  which  may  suffice  in  some  cases)  can  be  an  effective 
deterrent. 

•  Develop  policies  and  procedures  to  create  as  bright  a  line  as  possible  between  allowed 
and  disallowed  behaviors  (i.e.,  reduce  the  ambiguity). 

•  Consider  sociological  and  psychological  factors  and  create  better  cooperation  between 
information  systems  personnel  and  human  resources  personnel  (including  security, 
medical,  financial,  and  other  support  services).  In  short,  broaden  oversight  of  all 
aspects  of  a  user’s  background  and  behaviors. 

•  Combine  events  from  one  or  more  sensors  (possibly  of  various  types  or  different  levels  of 
abstraction)  to  facilitate  building  systems  that  test  hypotheses  about  malicious  insider 
(MI)  activity,  to  detect  MI  activity  that  is  not  detectable  using  a  single  event  record, 
to  develop  a  “calculus  of  evidence,”  to  develop  metrics  for  comparing  and  weighting 
diverse  inputs,  and  to  determine  how  “this  fusion”  can  be  used  to  create  useful  syn¬ 
thetic/compound  events. 


Databases  Needed 

Breakout  sessions  considered  what  databases  would  aid  in  this  research  if  they  were  available. 
Researchers  need  databases  containing  examples  of  specific  attacks,  the  characterization  of 
normal  behavior  for  users  in  different  roles  (including  that  of  a  system  administrator),  and 
artificial  or  real  sensor  data  that  include  a  mix  of  legitimate  and  malicious  activity.  Potential 
sources  for  the  development  of  such  datasets  include  a  MITRE  dataset  of  normal,  and 
“insider  threat”  network  activities;  data  from  the  ARDA  NIMD4  study;  data  obtained  from 
use  of  the  Glass  Box5  software;  synthetically  generated  data  from  a  simulator;  and  individual 
datasets  developed  by  researchers  that  might  be  traded  among  projects. 


A  Concluding  Remark 

During  a  concluding  plenary  session,  a  senior  member  of  the  intelligence  community,  hear¬ 
ing  the  results  from  the  various  breakout  session  deliberations,  made  the  comment,  “What 
you’re  doing  is  important,  but  don’t  forget  that  IC  analysts  are  people,  too,  and  need  a  good 
work  environment  in  which  to  stay  motivated  in  their  stressful  jobs.  When  considering 
'observables’  and  sensors  and  other  means  of  keeping  track  of  the  activities  of  ‘insiders,’ 
please  ask  yourselves,  ‘Would  I  want  to  work  in  that  (resulting)  environment?”’  It’s  impor¬ 
tant  to  keep  this  in  mind,  in  the  research  enthusiasm  for  what  might  be  monitored,  and 
observed,  and  data-correlated.  We  must  strike  a  balance  between  effectiveness  in  thwarting 


4  See  Appendix  I  for  information  about  the  ARDA  “Novel  Intelligence  from  Massive  Data”  (NIMD)  research  thrust. 

5  See  Appendix  H  for  information  about  the  “Glass  Box”  research  effort. 
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insider  exploits  against  intelligence  assets  and  effectiveness  in  the  process  of  generating  and 
disseminating  that  intelligence  information  itself 
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ACS 

Automated  Case  System  (FBI) 

API 

Application  Program  Interface 

ARDA 

Advanced  Research  and  Development  Activity 

CD 

compact  disk 

CD-ROM 

compact  disk-read-only  memory 

COI 

community  of  interest 

COMINT 

communications  intelligence 

DCI 

Director  of  Central  Intelligence 

DIA 

Defense  Intelligence  Agency 

DoS 

denial  of  service 

EUID 

electronic  user  identification 

HUMINT 

human  intelligence 

H/W 

hardware 

IA 

information  assurance 

IC 

intelligence  community 

IR 

infrared 

LAN 

local  area  network 

MASINT 

measurement  and  signatures  intelligence 

MI 

malicious  insider 

NIMD 

Novel  Intelligence  from  Massive  Data 

NRM 

NSA  Reference  Model 

NSA 

National  Security  Agency 

NT 

Windows  NT  (operating  system) 

OS 

operating  system 

PBX 

private  branch  exchange  (telephone  control) 

PDA 

personal  digital  assistant 
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public  key  infrastructure 
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quality  of  service 
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RF 

radio  frequency 

RFID 

radio  frequency  identification 

RUID 

radio  frequency  user  identification 

SAM 

surface-to-air  missile 

SCIF 

secure  compartmented  information  facility 

s/w 

software 

tcpdump 

transmission  control  protocol  dump  (program) 

TS/SI 

top  secret/special  intelligence 

URL 

universal  resource  locator 

USB 

universal  serial  bus  (computer  port) 

WMD 

weapons  of  mass  destruction 

CHAPTER  ONE 

Introduction 


The  operations  and  analyses  of  the  United  States  intelligence  community  (IC)1  are  based 
heavily  on  a  set  of  information  systems  and  networks  containing  extremely  sensitive  informa¬ 
tion.  Most  observers  believe  that  the  greatest  threat  to  the  integrity,  confidentiality,  and 
accessibility  of  the  information  in  these  systems  is  the  “insider  threat.”2  This  phrase  usually 
refers  to  a  malicious  insider,  acting  either  alone  or  in  concert  with  someone  “on  the  outside” 
of  these  systems.  However,  one  should  also  consider  the  possibility  of  unintentional  actions 
by  an  insider  that  can  have  substantial  adverse  consequences  or  that  draw  attention  to  him¬ 
self  when  innocent. 

Discussions  of  the  “insider  threat”  raise  many  questions:  Who,  exactly,  is  an  insider? 
Anyone  with  physical  or  electronic  access  to  these  networks,  including  maintenance  and  cus¬ 
todial  personnel?  How  much  sophistication  (if  any)  does  it  take  to  compromise  the  informa¬ 
tion  within  these  systems?  What  defenses,  including  “indicators  and  warning,”  might  be 
instituted  to  guard  against  this  insider  threat? 

To  address  these  questions,  the  Information  Assurance  (LA)  research  thrust  of  the 
IC’s  Advanced  Research  and  Development  Activity  (ARDA)  held  a  workshop  on  March 
2—4,  2004.  Participants  included  ARDA  contractors  working  on  the  insider  threat  to  infor¬ 
mation  systems  and  members  of  the  U.S.  intelligence  community  with  knowledge  about  its 
systems  and  networks.  It  was  held  at  the  offices  of  McAfee  Security,  a  division  of  Network 
Associates,  Inc.,  in  Rockville,  MD.  The  stated  objectives  of  this  workshop  were: 


1  The  agencies  normally  considered  to  constitute  the  IC  are  the  office  of  the  Director  of  Central  Intelligence,  the  Commu¬ 
nity  Management  Staff,  the  National  Intelligence  Council,  a  set  of  Defense  Agencies  (Defense  Intelligence  Agency;  National 
Security  Agency;  National  Reconnaissance  Office;  Army  Intelligence;  Coast  Guard  Intelligence;  Navy  Intelligence;  Air 
Force  Intelligence;  Marine  Corp  Intelligence;  National  Geospatial-Intelligence  Agency — formerly  the  National  Imagery  and 
Mapping  Agency),  and  the  non-Defense  agencies  (Central  Intelligence  Agency;  Federal  Bureau  of  Investigation;  Advanced 
Research  and  Development  Activity;  and  portions  of  the  Department  of  Treasury,  Department  of  Energy,  and  Department 
of  State.) 

2  As  evidence  for  this  statement,  consider  the  following  excerpt  from  a  presentation  on  the  Robert  Hanssen  case  presented 
during  the  opening  plenary  session:  (1)  “Since  the  1930s,  every  U.S.  agency  involved  with  national  security  has  been 
penetrated  by  foreign  agents,  with  the  exception  of  the  U.S.  Coast  Guard”  (Webster  Commission,  2002);  (2)  1 17  American 
citizens  have  been  prosecuted  for  espionage  between  1945  and  1990  (or  there  is  clear  evidence  of  their  guilt).  Money 
appears  to  be  the  main  factor;  most  spies  volunteered  their  services.  Prominent  examples  of  insider  spies  include: 

•  Aldrich  Ame s,  CIA  counterintelligence  officer  (nine  years  as  spy) 

•  Ronald  Pelton>  former  intelligence  analyst  for  NSA 

•  Jonathan  Pollard \  military  intelligence  analyst,  gave  Israel  800  classified  documents,  1,000  cables 

•  John  Walker ,  retired  naval  officer,  with  son  and  brother,  supplied  the  Soviets  with  cryptographic  material. 
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•  To  generate  and  capture  domain  knowledge  that  will  benefit  the  broad  base  of 
researchers  studying  the  Insider  Threat.  This  includes,  but  is  not  limited  to,  knowl¬ 
edge  about: 

—  Inside  attacker  characteristics,  including  the  vulnerabilities  they  tend  to  exploit, 
and  the  attack  methods  they  use. 

—  Attack  characterization,  including  the  necessary  or  likely  preconditions  for  an 
attack,  the  observables  generated  during  an  attack,  and  the  effects  of  the  attack. 

—  The  electronic  network  and  application  systems  used  by  the  IC  for  document 
management,  including  the  mechanisms  used  to  protect  the  systems  and  data. 

—  IC  business  models  for  generating  and  controlling  access  to  documents. 

•  To  foster  cooperation  among  researchers  by  developing,  to  the  extent  it  is  practical, 
methods  for  describing  common  aspects  of  their  work,  such  as  event  characterization, 
attack  and  attacker  classification,  etc. 

•  To  focus  researchers  on  specific  systems  and  problems  of  interest  to  the  IC.  We 
expect  these  to  take  the  form  of  challenge  problems. 

As  can  be  seen  from  the  above  description,  researchers  investigating  means  to  counter 
the  insider  threat  formed  the  “target  audience”  for  the  workshop:  its  purpose  was  to  supply 
them  with  relevant  knowledge  about  the  workings  of  the  IC,  the  types  of  document  or 
information  processing  used  by  IC  analysts,  and  the  architecture  of  the  ICs  underlying 
information  networks. 

The  workshop  was  unclassified,  requiring  that  only  generic  information  about  some 
aspects  of  IC  information  processing  activities  were  transmitted  to  researchers.  The  intent 
throughout  the  planning  for  this  workshop  was  that  the  information  generated  (and  as  cap¬ 
tured  in  this  present  document)  should  be  widely  available  to  anyone  working  on  the  insider 
threat  problem,  without  restrictions. 

The  remainder  of  this  document  consists  of  the  results  of  the  deliberations  of  the 
four  breakout  groups.  (Those  results  were  originally  presented  to  the  workshop  on 
PowerPoint  charts;  they  have  been  converted  to  a  prose  form  for  greater  readability  and  uni¬ 
formity  of  presentation  in  these  proceedings.)  The  descriptions  and  charters  given  to  those 
breakout  groups  were  as  follows: 

•  IC  Systems  and  Business  Models  for  generating  and  controlling  access  to  docu¬ 
ments.  This  group  will  capture  core  knowledge  about  the  business  processes  and  the 
supporting  network  and  application  systems  used  by  the  IC  for  document  manage¬ 
ment  (creation,  update,  and  dissemination).  This  includes  the  physical,  procedural, 
and  technical  mechanisms  used  to  protect  the  systems,  services,  and  information. 
Since  the  systems  are  highly  heterogeneous,  with  different  processes  and  mechanisms, 
depending  on  specific  system  functions,  we  expect  this  group  will  generate  a  family  of 
system  models  reflecting  current  IC  systems  practices  and  anticipating  future  IC  sys¬ 
tems  and  practices. 

•  Vulnerabilities  and  Exploits.  This  group  will  collect  and  organize  knowledge  about 
the  ways  insiders  have  attacked  systems  in  the  past  and  the  ways  they  might  attack 
them  in  the  future.  The  group  will  identify  ways  that  insiders  have  exploited  techni¬ 
cal  and  procedural  vulnerabilities  in  the  past  to  compromise  classified  information  or 
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to  affect  the  integrity  of  critical  information.  The  group  will  identify  the  necessary  or 
likely  preconditions  for  an  attack,  the  observables  generated  prior  to  and  during  an 
attack,  and  the  effects  of  the  attack.  It  is  important  to  emphasize  that  the  charter  of 
this  group  goes  beyond  studying  and  describing  technical  exploits  that  an  attacker 
could  use  to  “hack”  the  system.  In  the  past  many  of  the  most  damaging  exploits  have 
resulted  from  legitimate  use  of  system  accesses  for  illegitimate  purposes. 

•  Attacker  Models.  This  group  will  direct  its  attention  to  identifying  and  understand¬ 
ing  the  relevant  behavioral  characteristics  of  inside  attackers.  Examples  of  these  char¬ 
acteristics  are  attacker  objectives,  level  of  system  knowledge  and  access,  level  of 
patience,  tolerance  for  detection  risk  or  attack  complexity,  social  engineering  skills, 
and  technical  capabilities.  The  working  group  will  not  concern  itself  with  underlying 
psychological,  political,  or  economic  factors  that  might  motivate  some  of  the 
attacker’s  behavioral  traits.  An  important  part  of  this  group’s  effort  will  be  to  under¬ 
stand  how  the  attacker’s  observable  behavior  can  be  used  to  identify  him  as  an 
attacker. 

•  Event  Characterization.  This  group  will  identify  the  key  elements  necessary  to  char¬ 
acterize  events  associated  with  insider  attacks,  to  facilitate  tracking  and  interpreting  a 
potential  insider  attacker’s  activities.  This  will  assist  researchers  who  are  trying  to 
integrate  input  from  a  variety  of  sensors  to  assess  the  likelihood  of  attacker  activity 
and  likely  attacker  intent.  It  will  also  help  sensor  researchers  know  what  capabilities 
to  include  in  the  sensors  they  define. 

The  appendices  contain  the  invitation  to  the  workshop,  the  agenda,  a  set  of  links  to 
relevant  “read-ahead”  material,  and  a  list  of  participants.  We  also  include  PowerPoint  charts 
used  in  the  following  plenary  presentations  made  by  members  of  the  intelligence  community 
and  their  contractors: 

•  The  Robert  Hanssen  Case:  An  Example  of  the  Insider  Threat  to  Sensitive  U.S.  Informa¬ 
tion  Systems ,  by  Robert  H.  Anderson,  RAND  Corporation 

•  An  overview  of  the  results  of  a  recent  ARDA  workshop  on  Cyber  Indications  and 
Warning,  by  Mark  Maybury,  MITRE  Corporation 

•  Intelink  Factoids ,  by  Peter  Jobusch,  Intelink  Management  Office 

•  Glass  Box  Analysis  Project,  by  Frank  L.  Greitzer,  Battelle,  Pacific  Northwest  Division 

•  Interacting  with  Information:  Novel  Intelligence  from  Massive  Data  (NIMD),  by  Lucy 
Nowell,  ARDA. 


CHAPTER  TWO 

1C  System  Models 


Relevant  Taxonomies 

This  breakout  group1  began  by  reviewing  a  set  of  taxonomies  developed  in  a  previous  ARDA 
“Indicators  and  Warning”  workshop  (see  Appendix  F),  shown  in  Figures  2.1  through  2.3. 
These  figures  list  a  set  of  “observables”  that  might  be  used  to  determine  abnormal  behavior 
of  an  insider  or  of  IC  documents,  “assets”  within  the  IC  that  might  be  tracked,  and  a  list  of 
the  different  categories  of  “users”  (insiders)  within  the  IC  community. 

Figure  2.1 
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1  Participants  were  Paul  Esposito,  Chris  Geib,  Joseph  Giampapa,  Alexander  Gibson,  Terrance  (TJ)  Goan,  Clarence  Jones, 
Jr.,  Linda  (Miki)  Kiyosaki,  Sara  Matzner,  Mark  Maybury,  James  Newton,  David  Sames,  and  Thomas  Shackelford. 
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Figure  2.2 
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Figure  2.3 
1C  Users  Taxonomy 
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The  remainder  of  the  group's  deliberations  then  concentrated  on  a  description  of  the 
intelligence  process  as  it  relates  to  a  document  life  cycle,  and  a  reminder  that  there  are  other 
systems  involved  to  be  considered:  policy,  personnel,  physical  security,  etc. 
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Definition  of  the  Term  "Document" 

The  group  developed  the  following  definition  of  “document”  to  be  used  in  describing  IC  sys¬ 
tem  and  process  models: 

•  Any  collected  artifact  that  is  used  to  convey  information. 

•  Ultimate  purpose  is  to  inform  decisions  at  various  levels: 

—  Strategic 

•  Military 

•  Legislative 

•  Political 
—  Tactical 

•  Can  be  electronic  or  physical 

•  Can  be  structured  or  unstructured 

•  Image,  voice,  text,  other 

•  Attributes 

—  Owned,  managed,  protected. 


Characterization  of  the  Intelligence  Process 

The  group  characterized  the  intelligence  production  process  in  terms  of  the  diagram  in  Fig¬ 
ure  2.4,  involving  requirements  generation,  collection,  processing  and  exploitation,  analysis 
and  production,  dissemination,  and  consumption,  with  various  feedback  loops. 

In  describing  the  terminology  in  Figure  2.4,  the  group  also  referred  to  the  “NSA 
Reference  Model”  (NRM),  which  comprises  the  following  steps: 

•  Signal,  data  information,  knowledge,  intelligence 

•  Collection  gathers  signals  and  data 

•  Processing  and  exploitation  transform  signals  and  data  into  information 

•  Analysis  and  production  transform  information  into  knowledge  and  intelligence 

•  Intelligence  is  consumed. 

The  following  subsections  describe  each  of  these  processes  in  more  detail. 

Figure  2.4 
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Requirement 

A  requirement  is  a  statement  of  need  by  a  consumer  in  the  form  of  a  formalized  request. 
There  are  two  types: 

•  Standing  requirement .  It  is  vetted  by  a  consensus  process,  formalized  by  a  memo, 
entered  into  a  database,  and  is  persistent. 

•  Ad  hoc  requirement.  Anyone  can  submit  an  ad  hoc  requirement  to  a  specific  agency.  It 
is  stored  in  a  database. 

Some  additional  attributes  of  requirements  are  that  they  involve  checks  and  balances , 
with  multiple  levels  of  vetting  (e.g.,  committee  meetings  of  analysts);  there  are  potential  vul¬ 
nerabilities  (e.g.,  the  ability  to  change  and  modify  requirements);  the  internal  threat  level  is 
high;  cyber  or  non-cyber  attacks  are  possible  on  the  database  of  requirements;  and  indications 
and  warning  of  attacks  on  requirements  could  be  derived  from  database  audit  logs. 

Collection 

The  collection  process  was  described  as  acquisition  of  raw  data,  which  may  include  construc¬ 
tion  of  new  systems  for  performing  the  acquisition. 

Its  major  components  are  all  the  “INTs”  (e.g.,  COMINT,  HUMINT,  MASINT). 
Additional  attributes  of  the  collection  process  include  the  existence  of  checks  and  bal¬ 
ances ,  such  as  attribution  of  the  source,  techniques  to  preserve  the  integrity  of  collected  data, 
and  legal  restrictions  derived  from  government  policy;  potential  vulnerabilities ,  such  as  degra¬ 
dation  of  collected  data  (including  integrity  issues),  intercepts  and  eavesdropping,  and  denial 
of  service  from  collecting  and  sending  back  what  was  collected;  the  insider  threat  level  is  con¬ 
sidered  to  be  “les  s  likely”;  and  the  potential  for  collusion  was  deemed  to  be  “none.” 

Processing  and  Exploitation 

The  group  defined  processing  and  exploitation  as  selecting,  filtering,  and  rendering  the  col¬ 
lected  information  into  a  human-usable  form.  It  involves  converting  technically  collected 
information  into  intelligence.  Filtering  is  often  involved  and  may  be  performed  by  individu¬ 
als,  software,  or  groups. 

Analysis  and  Production 

All  IC  member  organizations  perform  analysis.  Analysis  is  the  transformation  of  information 
to  knowledge.  Production  is  the  formalization  of  knowledge  into  a  document  or  product. 

The  major  components  of  analysis  and  production  were  listed  as 

•  office  automation  tools 

•  secure  document  management  systems 

•  specific  analysis  tools 

•  working  aids,  such  as  software  search,  visualization,  and  other  programs 

•  communities  of  interest  (COIs) 

•  other  networked  and  local  databases. 

The  process  involves  checks  and  balancesy  including  hierarchical  reviews  for  accuracy, 
consistency,  accountability,  attribution,  security  (e.g.,  assuring  that  proper  markings  are  pre- 
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sent);  collaborative  production  (not  necessarily  co-located);  a  coordination  process  involving 
interpretation  and  enforcement  of  policies;  and  authorizations. 

The  group  felt  that  the  analysis  and  production  process  had  potential  vulnerabilities , 
mainly  to  working  documents  and  software  programs  constituting  the  working  aids. 

Dissemination 

Dissemination  is  the  distribution  of  intelligence  to  the  requesters  and  authorized  consumers. 
The  group  concentrated  on  electronic  dissemination  methods,  with  the  following 
characteristics. 

Its  major  components  are  chat,  e-mail,  and  communities  of  interest.  It  uses  both  syn¬ 
chronous  and  asynchronous  notifications  and  partial  dissemination. 

The  group  listed  attributes  of  the  dissemination  process  as 

•  checks  and  balances  for  dissemination,  which  are  well-defined  for  paper-based  docu¬ 
ments  and  may  use  electronic  watermarks  for  trace-back 

•  potential  vulnerabilities  including  breach  of  confidentiality,  denial  of  access  to  distri¬ 
bution  lists,  and  distribution  beyond  intended  consumers 

•  both  the  internal  threat  level  and  collusion  level  were  considered  to  be  high 

•  both  cyber  and  non-cyber  means  of  dissemination  were  considered  to  have  vulner¬ 
abilities ;  cyber  dissemination  techniques  were  deemed  to  be  less  uniform 

•  indications  and  warning  are  to  be  considered  on  a  per-dissemination  channel  basis. 

Consumption 

Consumption  of  intelligence  is  the  use  of  produced  intelligence  by  authorized  users  to  sup¬ 
port  decisionmaking. 

The  main  components  of  consumption  (on  the  individual  level)  were  stated  by  the 
group  to  be  the  five  human  senses. 

Checks  and  balances  in  consumption  include  traditional  security  procedures  and 
enforcement  of  policy.  Potential  vulnerabilities  lurk  during  consumption,  with  the  possibility 
of  exfiltration,  leaks,  misuse,  misinterpretation,  and  withholding. 

An  important  research  topic  exists  within  the  consumption  process:  finding  ways  to 
inject  checks  and  balances  within  that  process  to  provide  observables. 

The  internal  threat  level  related  to  consumption  was  deemed  high;  the  collusion  poten¬ 
tial  was  deemed  medium. 

Although  much  dissemination  is  electronically  based,  consumption  tends  to  remain 
primarily  non-cyber,  using  low  tech  and  traditional  means  (although  the  role  of  chat  and  e- 
mail  is  growing). 

One  indicator  and  warning  of  misuse  of  the  consumption  process  is  that  restricted- 
dissemination  data  show  up  in  the  public  press. 


Definitions 

The  discussion  group  used  the  following  definitions  in  describing  the  intelligence  process: 
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•  Insider,  Anyone  with  access,  privilege,  or  knowledge  of  information  systems  and 
services2 

•  Malicious  insider  (MI):  Motivated  to  intentionally  adversely  impact  an  organization’s 
mission  (e.g.,  deny,  damage,  degrade,  destroy) 

•  Observable*.  Anything  that  can  be  detected  with  current  technology3 

•  Sensor  Measures  an  observable  (e.g.,  login,  print,  delete) 

•  Sensor  logs :  Recording  of  observables 

•  Sensor  stream :  Series  of  observables  from  a  sensor 

•  Indicator  Identifiable  event  based  on  sensor  output  logs 

•  Detect:  Determines  an  event  based  on  processing  of  indicators 

•  Report.  Indications  and  warnings  of  malicious  insider  behavior 

•  Incident.  Related  set  of  events 

•  Fusion:  Processing  multiple  sensor  outputs  to  provide  an  enhanced  result  (e.g.,  more 
abstract  or  more  concrete;  higher  confidence) 

•  Case:  One  or  more  incidents  that  share  common  attributes,  and  are  deemed  to  be 
(potentially)  related. 


Reference 

The  group  cited  Lowenthal  (2003)  as  a  reference  for  information  about  the  intelligence 
process. 


2  Note  that  we  do  not  say  “legitimate  access.”  Someone  (e.g.,  a  janitor,  a  service  technician)  may  be  given  access  accidentally 
or  inadvertently  but  nevertheless  have  access  to  certain  “insider”  privileges. 

3  Since  we  are  concerned  with  research  on  automated  detection  of  insider  threats,  we  do  not  include  here  observables  that 
are  only  human-detectable,  unless  that  observer  acts  like  a  sensor  and  records  the  observation  for  subsequent  processing 
steps. 


CHAPTER  THREE 

Vulnerabilities  and  Exploits 


Group  Focus 

This  breakout  group1  stated  that  their  focus  was  on  ways  insiders  attack  information  systems, 
including 

•  Preconditions  for  attack 

•  Observables 

•  Effects. 

The  group  included  illegitimate  use  of  authorized  access  and  focused  on  threats  and 
vulnerabilities  to  IC  networks.  Because  this  was  an  unclassified  workshop,  certain  vulner¬ 
abilities  may  be  known  at  a  classified  level  that  cannot  be  described  here;  they  will  be 
described  only  at  a  generic  level. 


Overview  of  Group  Deliberations 

The  group  decided  its  limited  time  was  best  spent  on  the  following  activities: 

•  Look  at  some  real  life  “war  stories”  about  insider  threats  to  critical  information  sys¬ 
tems 

•  Decompose  those  and  similar  events  to  determine 

— preconditions  (involving  both  physical  and  logical  access) 

—  observables  (that  could  have  been  used  to  thwart  the  attack) 

—  effects  (of  the  attack). 


"War  Stories" 

The  group  started  by  asking,  “Has  this  ever  happened  in  your  world?”  and  gave  these  as 
examples.  (The  contributors  vowed  that  all  have  been  seen  “in  practice”  in  the  real  world, 
except  for  the  second  one,  which  was  used  in  a  test  only.) 


1  Participants  were  Robert  Anderson,  Philip  Burns,  Matthew  Downey,  Jeremy  Epstein,  Dana  Foat,  Steve  Harp,  Dennis 
Heimbigner,  Kevin  Killourhy,  Vincent  Lee,  Mark  Morrison,  Mike  Pelican,  and  Brad  Wood. 
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•  An  insider  walks  into  the  secure  compartmented  information  facility  (SCIF),  pulls 
out  from  under  his  coat  a  freshly  burned  CD,  sticks  it  into  his  classified  workstation , 
and  then  selects  RUN. 

•  An  insider  locks  administrators  out  by  making  multiple  attempts  against  their  pass¬ 
words  until  the  system  locks  out  their  passwords. 

•  A  user  calls  the  help  desk:  “Hello,  this  is  Major  Smith — can  you  reset  my  password?” 
And  the  help  desk  doesn’t  verify  that  it’s  really  Major  Smith  who’s  calling. 

•  An  insider  finds  a  niffy  attack  on  the  Internet.  He  asks  himself,  “Gee,  I  wonder  if  this 
will  work  on  our  LAN?” 

•  The  business  portion  of  the  agency  was  sent  a  system  patch,  but  they  didn’t  give  it  to 
the  security  guys.  The  system  didn’t  get  patched. 

•  An  insider  modifies  a  valid  system  patch,  which  then  gets  distributed  to  the  whole 
“world”  of  that  agency  via  LiveUpdate . 

•  What  if  .  .  .  someone  modified  a  planning  database  to  change  the  coordinates  for 
SAM  sites  2.0  km  to  the  south,  and  make  them  SAM -2  instead  of  SAlM-5  missiles  (so 
that  they  were  perceived  to  have  shorter  range).  Pilots  would  get  shot  down.  All  this 
requires  is  access  to  a  database  in  Microsoft  Access,  Excel,  etc. 

•  A  malicious  insider  copies  a  TS/SI  file  from  his  classified  workstation  onto  a  USB 
port  “flash  drive,”2  moves  it  over  to  his  unclassified  system,  and  mails  it  out,  all 
within  the  same  office. 

•  Another  insider  installs  a  keystroke  logger  to  get  a  few  passwords  to  another  computer 
in  the  same  office. 

•  A  database  administrator  makes  an  extra  copy  of  the  database  files,  but  says  the  tapes 
are  bad.  He/she  then  carries  the  tapes  out,  and  no  one  is  the  wiser. 

•  An  insider  has  a  wireless  transceiver  in  his  unclassified  system,  to  transmit  files  after 
they  have  been  moved  from  his  classified  workstation  to  his  unclassified  one  (see 
“USB  flash  drive,”  above). 


Attack  Actions,  Observables,  Effects 

Having  “warmed  up”  on  the  above  examples,  the  group  then  attempted  to  develop  a  more 
complete  listing  of  “discrete  attack  actions”  (many  of  which  could  be  combined  in  various 
ways  into  more  complete  attack  scenarios).  In  creating  this  list,  the  group  used  the  following 
definitions: 

•  Attack  action:  Any  nefarious  activity  undertaken  by  an  adversary.  (It  does  not  have  to 
result  in  a  loss  of  confidentiality,  integrity,  or  availability.)  The  group  also  intends  to 
focus  on  “atomic”  attacks  that  would  be  part  of  a  larger  campaign. 

•  Observable :  Anything  that  could  be  detected  with  current  technology,  or  with  any 
other  technology  that  might  be  considered  possible.  (Note  that  this  definition 
extends  the  definition  cited  by  the  “IC  System  Models”  group  [Chapter  Two]). 


2  The  reference  is  to  a  very  small  keychain  device  that  plugs  into  a  computer’s  USB  port  and  acts  like  a  removable  disk.  At 
this  writing  (October  2004)  they  are  available  in  sizes  ranging  from  at  least  32  KB  to  four  gigabytes. 
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Table  3.1  at  the  end  of  this  chapter  contains  the  group's  expanded  list  of  “discrete 
attack  actions."  The  table  gives  a  name  for  each  of  the  33  attack  mini-scenarios  listed,  some 
scenario  details,  preconditions,  expected  observables,  and  likely  effects. 

Reviewing  the  attacks  listed  at  the  end  of  this  chapter,  the  group  made  the  following 
general  observations: 

•  Many  are  just  enablers  for  chains  of  attacks. 

•  Access  (either  logical  or  physical)  is  a  prerequisite  condition  for  all  attacks. 

•  Some  attacks  have  no  observables .  It's  an  important  research  question  to  consider  how 
that  can  be  fixed. 

In  discussing  the  list  of  33  attacks  in  a  plenary  session,  the  recommendation  was 
made  by  a  participant  that  this  list  should  have  an  added  column:  “Existing  remediation.” 
That  column  would  contain  information  on  what  measures  are  in  place  today,  in  various  IC 
enclaves,  to  thwart  the  attacks  listed.  We  recommend  this  as  a  useful  piece  of  additional 
research  to  be  performed.  There  was  insufficient  time  for  this  group  to  investigate  that  issue 
and  add  the  column  during  the  workshop  itself. 

Another  group  used  the  listing  in  Table  3.1  and  integrated  these  results  into  its  own 
taxonomy  (see  Figure  4.1 1  in  Chapter  Four,  “Attacker  Models”). 


Roles 

The  group  observed  that  the  attacker's  access  and  perspective  vary  depending  on  his  role  in 
the  enterprise.  Insiders  could  be  system  administrators,  users,  managers,  analysts,  linguists, 
“geeks”  (computer  specialists),  or  others. 

In  response  to  comments  from  the  larger  workshop  that  the  attacks  look  technical, 
this  group  responded,  “They’re  [the  attacks  are]  actually  brain-dead!”  The  group  emphasized 
that  although  the  attacks  may  look  sophisticated  to  a  “traditional”  analyst,  the  next  genera¬ 
tion  of  analysts  will  have  grown  up  with  computers,  cyber  games,  and  the  like,  and  all  this 
will  be  second-nature  to  them.  We  need  to  think  about  future  malicious  insiders  and  not  be 
overly  influenced  by  previous  attacks. 


Grand  Challenges 

Each  group  attempted  to  formulate  a  set  of  “grand  challenges”  for  research  in  discovering 
and  mitigating  the  insider  threat.  The  challenges  listed  by  the  “threats  and  vulnerabilities” 
group  were  the  following: 

•  Create  effective  deterrents  to  cyber  espionage. 

—  We  need  better  ways  to  enforce  and  monitor  the  deterrents  to  put  the  “fear  of 
God”  in  cleared  people. 

—  The  IC  may  not  even  need  real  monitoring;  just  the  perception  of  monitoring 
may  be  enough  (similarly,  some  states  use  cardboard  state  troopers  to  slow  down 
traffic). 
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•  Define  an  effective  way  of  monitoring  what  people  do  with  their  cyber  access,  for 
purposes  of  identifying  acts  of  cyber  espionage. 

—  Focus  on  detection,  not  prevention  (in  the  post-9 / 1 1  world,  we  need  to  allow 
everyone  access  to  “everything”;  instead,  develop  filters  to  find  the  nefarious  acts). 

•  We  need  policies  and  procedures  to  create  as  bright  a  line  as  possible  between  allowed 
and  disallowed  behaviors  to  reduce  the  ambiguity. 

—  If  the  rules  aren't  realistic,  then  they  dilute  the  overall  impression  of  enforcement. 
—  The  I C  should  therefore  adjust  the  rules  to  be  realistic  and  focus  on  what  is 
important  instead  of  trying  to  stop  all  disallowed  behaviors  equally. 

•  Consider  sociological/psychological  factors ,  and  create  better  cooperation  between 
information  systems  personnel  and  human  resources  personnel  (to  include  security, 
medical,  financial,  and  other  support  services).  In  short,  broaden  oversight  of  all 
aspects  of  a  user's  background  and  behaviors. 

—  Identify  precursors  to  changes  in  an  insider’s  “moral  compass” — can  this  be  mod¬ 
eled?  Focus  limited  resources  on  insiders  who  present  a  greater  risk. 

—  The  clearance  personnel  should  tell  the  cyber  personnel  who  the  risky  people  are 
(or  what  risky  behavior  is),  and  vice  versa. 

—  We  need  multidisciplinary  research  teams  (not  just  geeks)  investigating  what  we 
should  look  for  as  indicators  of  possibly  malevolent  behavior. 

Surprising  Lessons  Learned 

Each  group  was  asked,  “What  are  the  most  surprising  findings  that  came  out  of  your  delib¬ 
erations?”  This  group  answered: 

•  Espionage  case  history  does  not  cover  cyberspace.  Most  case  histories  do  not  involve 
interesting  cyber  exploits  that  we  know  could  be  used.  Looking  backward  at  case  his¬ 
tories  doesn’t  prepare  us  for  what  is  coming  with  a  more  computer-sawy  generation 
of  analysts. 

•  Things  are  looser  than  we  might  have  expected. 

—  Life  is  not  as  structured  on  the  IC  networks  as  we  thought. 

—  Policy  and  practice  aren’t  always  the  same  (fewer  people  are  searched,  even  spo¬ 
radically  and  randomly,  than  expected). 

—  Our  insiders  are  really  trusted. 

There  is  no  practical  way  to  prevent  exfiltration  by  even  a  moderately  determined 
adversary,  especially  given  modern  technology.  For  example,  USB  flash  drives  and  CD- 
ROMs  can  hold  huge  amounts  of  data  in  a  small  space  that  can  be  hidden. 

•  COTS  software  is  a  real  threat  developed  off-shore  by  uncleared  foreign  nationals. 

Datasets  Required 

Another  question  asked  of  all  groups  was,  “What  datasets  do  you  need  for  your  research?” 
This  group’s  responses  were: 
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•  The  MITRE  dataset  might  be  extended  to  be  more  useful. 

—  Example:  Record  more  things  at  the  host  level  and  more  things  outside  the  cyber 
domain  beyond  badge  logs  (e.g.,  where  people  are,  use  of  photocopiers,  phone 
records). 

—  “You  don’t  need  just  one  dataset — we  need  lots  of  them.” 

•  Data  from  ARDA  NIMD  study3  (recording  “normal”  analyst  activity)  seem  very 
promising. 

•  Enhance  the  NIMD  study  with  a  different  fictitious  set  of  insiders  (e.g.,  using  some 
of  the  attack  actions  listed  in  Table  3.1)  with  other  areas  of  interest  and  roles. 

—  Maybe  leverage  Glass  Box  software.4 


Measures  for  Success 

During  their  deliberations,  this  group  asked  themselves,  “What  are  the  measures  by  which 
we  can  judge  success?”  That  is,  how  can  we  know  that  we  have  been  successful  at  the  end  of 
the  workshop?  They  decided  on  two  criteria: 

•  We  have  identified  observables  that  have  not  yet  been  highlighted  by  researchers 

•  We  have  developed  a  list  of  “challenge  problems”  based  on  the  real  threat  to  IC 
information  systems. 


Based  on  those  criteria  and  the  material  in  this  chapter,  the  group  felt  its  delibera¬ 
tions  had  been  successful. 


Table  3.1 

Attack  Actions,  Preconditions,  Observables,  and  Effects 


Attack  Action 

Scenario  Details 

Preconditions 

Observables 

Effects 

1.  Virus-laden 

Malicious  insider  (Ml)  puts  viruses 

No  physical  checks 

Physical  observation  of 

Enabler  for 

CD  and/or 

or  other  malware  on  removable 

on  inbound  mate- 

media  movement; 

numerous  other 

USB  flash 
drive  and/or 
floppy 

media,  carries  it  into  an  IC  envi¬ 
ronment,  and  inserts  it  into  a 
system.  Depending  on  the  mal¬ 
ware  being  introduced,  it  may 
impact  confidentiality,  integrity, 
and/or  availability. 

rials;  physical  and 
logical  access  to 
machine;  media 
on  machine 

tamper  tape  over 
media  slots;  closed- 
circuit  camera,  NT 
event  log  will  show 
media  access  (but  big 
impact  on  perform¬ 
ance) 

attacks 

2.  Administra¬ 

Ml  finds  names  of  administrators, 

Login  access 

Log  entry  for  account 

Enabler  for 

tor  lockout 

and  then  tries  to  log  in  as  the 
administrator  (knowing  that 
after  some  number  of  failures 
the  admin  will  get  locked  out). 
Once  all  administrator  accounts 
are  locked  out,  the  Ml  can  per¬ 
form  attacks  knowing  that  the 
admin  is  unlikely  to  be  able  to 
log  in  and  detect  or  solve  the 
problem. 

(remote  or  local), 
no  multi-factor 
authentication, 
machine  set  to 
lock  out  after 
failed  login 
attempts;  names 
of  admin  user 
logins 

lockout;  log  entry  for 
admin  lockout- 
repeated  lockouts  of 
any  user  over  a  short 
period  of  time 

numerous  other 
attacks  plus 
availability 
attack  on 
SysAdmins 

3  See  Appendix  I  for  information  about  the  ARDA  “Novel  Intelligence  from  Massive  Data”  (NIMD)  research  thrust. 

4  See  Appendix  H  for  information  about  the  “Glass  Box”  research  effort. 
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Table  3.1 — Continued 

Attack  Action  Scenario  Details  Preconditions 

3.  Social  engi-  Ml  calls  up  the  help  desk  and  says  Help  desk  doesn't 
neer  pass-  "Hi,  I'm  Major  Smith,  I  forgot  my  have  a  way  to 

words  password."  Tries  to  convince  absolutely 

help  desk  to  reset  or  tell  him/her  authenticate 
the  password.  request 


4.  Retry  Inter-  Attack  scripts,  worms,  viruses,  etc.  Patches  not 
net  attacks  from  various  websites  in  cyber-  installed 

space  are  downloaded  and  exe-  promptly;  if 
cuted  on  local  secure  LAN  attack  is  trivial, 

network  access, 
or  if  not,  same  as 
virus-laden  CD 
or  USB  flash  drive 


Insufficient  physi¬ 
cal  checks  on 
outbound;  physi¬ 
cal  and  logical 
access  to 
machine;  media 
on  machine 

6.  "Missing"  A  laptop,  personal  digital  assistant  Insufficiently  pro¬ 
laptops/  (PDA)  or  other  device  is  removed  tected  data  on 

hardware  from  a  secure  facility;  the  Ml  can  machine;  physical 

gain  access  to  its  contents  access;  aperiodic 

inventory  checks 

7.  Targeted  Bad  guys  buy  surplus  equipment  Insufficient  saniti- 

acquisition  of  from  government  agencies  at  zation  process; 

surplus  auctions  (perhaps  tipped  off  by  physical  access  to 

equipment  insider),  and  search  disks  for  sen-  equipment 

sitive  information 

8.  Unpatched  An  insider  takes  advantage  of  Patches  available 

systems  knowledge  that  sensitive  info  but  not  installed 

systems  aren't  patched  promptly,  promptly 
and  uses  a  recent  attack  method 
to  gain  root  access  to  a  server 

9.  Sabotaged  Ml  alters  a  patch  to  be  dissemi-  Ability  to  alter 

patches  nated  to  all  LAN  systems  in  the  patches;  patch 

secure  facility,  enabling  a  trap-  distribution  sys- 

door  to  permit  greater  access;  tern 

that  patch  then  gets  installed 

automatically  on  all  systems 

within  the  enclave 

10.  False  posi-  Ml  creates  a  file  containing  the  Ability  to  create  a 

tives  on  anti-  signature  of  a  known  virus,  and  virus  signature 

virus  distributes  it  within  the  enclave.  (DoS  attack) 

Virus  detection  software  sends 
alerts  and  restricts  access,  caus¬ 
ing  denial  of  service 

1 1 .  Use  of  While  a  user  is  absent  from  his  Physical  access;  no 

unattended  terminal,  Ml  accesses  his/her  files,  timely  screen 

terminal  which  are  more  sensitive  and  saver/lockout- 

restricted  than  those  to  which  he  insufficient 
has  normal  access  authentication 

12.  Targeting  Ml  makes  subtle  changes  to  data  Database  access; 

database  within  a  key  (e.g.,  targeting)  knowledge  of 

"adjust-  database,  causing  inaccuracy  in  database  lay- 

ments"  subsequent  field  operations  out/semantics;  no 

integrity  checks 


5.  Smuggling  USB  flash  drive  on  keychain,  or 
out  flash  hidden  on  body  during  ingress 

drive  or  other  and  egress  to  controlled  area 
media 

(exfiltration) 


Observables  Effects 

Trouble  ticket  monitor-  Enabler  for 
mg;  integration  with  numerous  other 

badging  system  to  attacks  plus  con- 

detect  whether  in  fidentiality 

building  attack  against 

the  password 
itself 

Alien  software  installed  Enabler 
on  hosts;  intrusion 
detection  systems  that 
monitor  commands, 
system  calls,  URLs, 
etc.;  frequency  of 
patch  installation; 
patches  don't  fix 
problems  (hard  to  tell 
if  up  to  date) 

NT  event  log  shows  Confidentiality 
media  access;  physical 
check  on  egress 


RFID  on  hardware  Confidentiality  + 
devices  (but  also  helps  Availability 
adversaries);  failed 
physical  inventory 
check;  network  census 
failure 

"For  sale"  ad  on  eBay...  Confidentiality 
photo  with  classified 
sticker 


Time  interval  between  Enabler 
patching  of  operating 
systems,  applications, 
etc.;  vulnerability 
checkers 

Integrity  check  with  Enabler 
vendor  to  ensure 
patches  are 
unchanged 


Abnormally  high  num-  DoS/ Availability 
ber  of  virus  infection 
indicators  on  atypical 
files 


Abnormal  signature  Confidentiality/ 
from  terminal  Integrity/ 

Availability 


Transaction  log  indicat-  Integrity 
ing  change 
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Table  3.1 — Continued 


Attack  Action  Scenario  Details  Preconditions  Observables 


13.  Keystroke 
logger 


14.  Extra  copy 
of  DB  back¬ 
ups 


15.  Wireless 
transmissions 


16.  Cell 
phone/PDA/ 
voice  record¬ 
er  in  classified 
meeting 


17.  Suspicious 
activity  on 
real  systems 
(e.g.,  search¬ 
ing  own 
name  in 
databases) 

18.  Mislabeled 
paper 


19.  Netmeet- 
ing/WebEx 
controls 


20.  Zero  day 
attacks  based 
on  source 


Ml  installs  a  keystroke  logger 
hardware  or  software  on  an 
associate's  computer  within  the 
enclave;  later  retrieves  password 
information  from  it 

Ml  writes  a  backup  copy  of  a  sen¬ 
sitive  database  to  tape;  declares 
the  backup  or  tape  "bad,"  and 
reperforms  backup.  Removes 
extra  copy  of  backup  tape  from 
the  facility 

Ml  uses  a  wireless  PC  card,  USB 
device,  or  other  device  to  send 
wireless  signals  out  of  a  sensitive 
enclosure,  to  an  associate  or 
device  outside  the  facility 


Modern  cellphones  and  PDAs  are 
sufficiently  small  that  they  can 
be  brought  surreptitiously  (or 
even  accidentally)  into  a  secure 
facility  unnoticed;  they  can  be 
put  into  "record"  mode  to  cap¬ 
ture  voice  recordings  (e.g.,  into 
voice  message  files) 

Robert  Hanssen  periodically  que¬ 
ried  the  FBI  info  systems  for  his 
own  name,  plus  "dropbox"  and 
other  keywords,  to  see  if  he  was 
being  discovered 


Physical  or  logical 
access  (depending 
on  type  of 
logger);  may 
require  unmoni¬ 
tored  access 

Ability  to  make 
extra  backup, 
physical  access; 
no  monitored 
egress;  no 
accountability  on 
media 

Ability  to  install  or 
enable  wireless; 
small  wireless 
transceiver;  enclo¬ 
sure  isn't  suffi¬ 
ciently  shielded 
(IR,  802.11, 
other?) 

Insufficient  physi¬ 
cal  monitoring; 
insufficient 
shielding;  insuffi¬ 
cient  RF  monitor¬ 
ing 


Physical  tamper  indica¬ 
tor  for  physical  log¬ 
ger;  Tripwire-type 
software 


Backup  log  (or  change 
to  backup  log),  #  of 
tapes  used,  destruc¬ 
tion  evidence  (not 
logs) 


Picking  up  wireless  sig¬ 
nals  (RF),  some  optical 
techniques  to  detect 
IR  (monitoring  IR  in  an 
enclave  might  be 
enhanced  by  adding 
reflectors) 

Metal/device  detector 
in/out  of  SCIF;  RF 
detector  in  facility 


No  monitoring  of  "Adversary  activity" 
activity  (capture  detected  in  audit  logs 
and  analysis  of 
audit  trails) 


A  sensitive  document  has  its 
header  and  foot  labels  changed 
(e.g.,  by  using  a  word  processor) 
to  indicate  a  lesser  (or  no) 
degree  of  classification  or  sensi¬ 
tivity,  is  then  printed 

Software  such  as  "Netmeeting," 
"Web Ex,"  or  "Timbuktu"  allows 
one  user's  display  to  be  linked 
to,  and  slaved  onto,  another's. 
Can  this  be  done  surreptitiously 
without  the  first  person's  knowl¬ 
edge  (e.g.,  when  they're  away 
from  their  terminal)?  If  so,  all 
documents,  e-mail,  etc.  displayed 
on  the  first  terminal  are  mirrored 
onto  the  second  one 

Major  portions  of  source  code  for 
an  earlier  version  of  the  Micro¬ 
soft  Windows  operating  system 
have  become  available  on  the 
Web.  An  Ml  uses  knowledge 
gained  from  the  source  to  craft  a 
new  attack,  and  uses  it  within 
the  secure  enclave.  It  is  called 
"zero  day"  because  it  has  never 
been  used  before,  and  hence  no 
monitors  are  configured  to 
detect  it 


Possible  to  change 
markings  on 
page;  reliance  on 
page  markings  as 
authorization  for 
removal 

Existence  and 
default  enable¬ 
ment  of  tools 
within  enclave 


Disclosure/knowl¬ 
edge  of  source 
code;  tools  to 
search  for  vul¬ 
nerabilities; 
assumption  that 
obscurity  is  sig¬ 
nificant  source  of 
security 


No  cyber  observable! 
(maybe  some  cyber 
observable  if  there  are 
highly  instrumented 
applications/OS-level) 

Test  if  tool 

installed/enabled;  test 
if  tool  active;  indicate 
who  is  using  tool 
(attribution).  Deter¬ 
mine  what  traffic  is 
being  exchanged 
using  that  tool 


No  direct  cyber  observ¬ 
ables;  some  derivative 
observables  based  on 
undesirable  applica¬ 
tions  behavior; 
knowledge  that 
source  code  has  been 
disclosed 


Effects 


Confidentiality  + 
Enabler 


Confidentiality 


Confidentiality 


Confidentiality 


May  implicate 
adversary 


Confidentiality 


Enabler 


Enabler 
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Table  3.1 — continued 

Attack  Action  Scenario  Details  Preconditions  Observables  Effects 

An  Ml  hides  sensitive  information  Technology  to  Information  content  Confidentiality 
within  a  picture  or  other  digital  allow  encoding  analysis  results 
unclassified  document,  and  and  decoding 

removes  it  from  the  secure  facil-  (including  ability 
ity  to  install  soft¬ 

ware) 

22.  Copy  and  An  Ml  uses  "copy  "and  "paste"  Ability  to  copy  and 

paste  between  two  windows  on  his/her  paste  between 

between  clas-  PC,  allowing  one  or  more  classification  lev- 

sifications  paragraphs  from  a  highly  sensi-  els 

(from  high  to  tive  document  to  be  placed  in  a 
low)  less-sensitive  one,  then  removes 

the  latter  document 

23.  Internal  e-  Ml  creates  and  sends  e-mail  within  E-mail  agent  proc-  E-mail  download  with  Enabler 

mail  that  the  secure  facility  that  has  esses  scripting  content;  executable 

performs  JavaScript  or  an  attachment  with  without  user  content;  indication  of 

attacks  content  that  installs  trapdoors  or  approval  or  whether  the  con- 

creates  other  vulnerabilities  knowledge  tent/execution  parsing 

agree;  indication  of 
whether  the  parsed 
content  is  nefarious 
(hard  problem) 

24.  Wireless  Installs  a  small,  wireless  video-  Ability  to  install  RF  emissions 

cameras  to  camera  in  the  ceiling  above  camera  and  exfil- 

shoulder  surf  another  user's  workstation  to  tration  (via  wire- 

capture  keystrokes  and  obtain  less  or  media) 

passwords 

25.  Telephone  A  telephone  tap  is  installed  within  Plain-text  voice  None 

tap  recording  a  secure  facility  that  records  transmission; 

onto  information  on  a  CompactFlash  access  to  telco 

removable  card  or  other  removable  medium  media 
media 

26.  Telephone  Modern  PBXs  are  computers.  They  Plain-text  voice  Depends  Confidentiality 

tap  via  hack-  can  be  "hacked"  via  default  transmission; 

ing  PBX  maintenance  passwords  or  other  access  to  telco 

means,  to  gain  access  to  tele-  media;  physical  or 
phone  conversations  logical  access  to 

PBX/distribution 

27.  Analyst  Ml  surreptitiously  changes  dis-  Analyst  ability  to  Actual  product 

changes  semination  instructions  (e.g.,  in  change/deter-  workflow 

workflow  to  software)  to  exclude  some  ana-  mine  workflow 
exclude  other  lysts  from  obtaining  information  for  intelligence 
analysts  (dis-  they  need,  thereby  harming  the  product  (man- 
semination)  resulting  intelligence  product  agement  author¬ 

ity) 

Ml  surreptitiously  adds  him/  Analyst  ability  to  Actual  product  Integrity 

herself  to  dissemination  instruc-  change/deter-  workflow 

tions  to  become  aware  of  sen-  mine  workflow 
sitive  information  without  need  for  intelligence 

to  know  product  (man¬ 

agement  author¬ 
ity) 

29.  Insert  bad  Mlinvolved  with  the  creation  of  Translation  author-  Inspection  of  product;  Integrity 

content  into  intelligence  information  (e.g.,  ity  (ability  to  ere-  comparison  of  prod- 

report  upon  during  a  translation  process)  ate  source  report)  uct  to  raw  data 

inception  alters  its  content,  thereby 

(e.g.,  trans-  harming  the  intelligence  product 

lation) 

30.  Delete/  Ml  involved  with  the  creation  of  Translation  author-  None  (probably);  com-  Integrity 

withhold  intelligence  information  (e.g.,  ity  (ability  to  ere-  parison  to  redundant 

content  into  during  a  translation  process)  ate  source  report)  report 

report  upon  deletes  key  portions  of  its  con- 
inception  tent,  thereby  harming  the  result¬ 

ing  intelligence  product 


28.  Analyst 
changes 
workflow  to 
include  him¬ 
self/herself 


Integrity/Avail¬ 

ability 


Confidentiality 


Confidentiality 


Applications/OS  Confidentiality 

instrumentation  only 
(difficult  problem) 


21.  Covert 
channels 
through 
steganogra- 
phy 
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Table  3.1 — continued 


Attack  Action 

Scenario  details 

Preconditions 

Observables 

Effects 

31.  Redirect 

Ml  obtains  access  to  software  or 

Analyst  ability  to 

Collection  does  not 

Integrity/ 

analyst 
resources  to 
support  some 
agenda 

other  procedures  by  which  ana¬ 
lyst  resources  are  allocated,  and 
redirects  them  for  own  purposes 

change/deter¬ 
mine  workflow 
for  intelligence 
product  (man¬ 
agement  author¬ 
ity) 

meet  exact  require¬ 
ment 

Availability 

32.  Poor  qual¬ 
ity  analysis/ 
results/ 
reports 

Ml  involved  in  intelligence  analy¬ 
sis  deliberately  creates  faulty  or 
misleading  reports 

Translation  author¬ 
ity  (ability  to  cre¬ 
ate  source  report) 

None/few;  customer 
feedback;  job  per¬ 
formance  reviews 

Integrity 

33.  Get  1C  asset 
to  collect 
information 
that  benefits 
an  unautho¬ 
rized  party 

Ml  obtains  access  to  the  process 
by  which  1C  assets  are  tasked, 
and  alters  the  tasking  so  that 
information  of  use  to  the  Ml  (but 
not  the  U.S.  intelligence  effort)  is 
obtained 

Tasking  authority 

Lack  of  requirement  for  Availability 
the  collection  activity. 

CHAPTER  FOUR 

Attacker  Models 


Group  Focus 

This  breakout  group1  began  its  deliberations  by  stating  some  broader  concerns: 

•  Are  attacker  models  providing  too  much  information  to  the  bad  guys  on  what  we 
look  for?2 

•  Can  individual-level  activity  monitoring  cause  the  malicious  insider  to  change  but 
not  limit  his  activity  while  causing  the  benign  insider  to  restrict  his  creativity? 

•  How  should  we  reconcile  the  need  to  collaborate  more  with  better  security? 

The  group  then  listed  the  following  more  specific  questions  about  modeling  attackers 
as  important  items  to  discuss: 

•  What  are  the  defining  characteristics  and  reasonable  values  for  attacker  models? 

•  What  are  the  observables  (actions,  artifacts)  that  an  insider  generates  and  how  can  they 
be  used  to  determine  benign  or  malicious  behavior? 

•  How  could  an  attacker  “cover  his  trail”  of  observables? 

•  What  is  the  overall  scope  of  activities  that  a  malicious  insider  might  undertake  and 
what  role  do  environmental  factors  play? 

•  Can  we  create  a  taxonomy  of  security  controls,  and  how  can  we  correlate  these  con¬ 
trols  with  detected  observables  to  tune  and  manage  responses? 

•  How  can  we  generate  models  of  normal  behavior  as  well  as  malicious  behavior,  and 
how  can  we  generate  data  to  test  the  models? 

•  How  can  we  find  signs  of  malicious  insider  behavior  in  unusual  environments,  such 
as  high-performance  machines  or  massive  visualization  data  transport? 

•  How  do  insiders  seek  information  and  how  can  researchers  use  context  to  determine 
if  the  information  seeking  is  benign  or  malicious? 

1  Participants  were  Matthew  Downey,  Tom  Haigh,  Steve  Karty,  Van  Lepthien,  Rich  Neely,  Greg  Stephens,  Frank  Greitzer, 
Thomas  Hetherington,  Stephen  Laird,  Tom  Longstaff,  Marisa  Reddy,  and  Edward  Wright. 

2  This  was  one  of  the  few  instances  during  the  workshop  in  which  counterintelligence  effects  of  insider  threat  research  were 
explicitly  highlighted. 
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A  First-Cut  Notional  Insider  Model 

Figure  4. 1  represents  a  top-level  view  of  an  insider  attacker  model  used  by  the  group. 

The  group  then  compared  this  model  to  the  known  Robert  Hanssen  case  history  (see 
Appendix  E  for  details).  The  Hanssen  case  was  summarized  as  shown  in  Figure  4.2. 

The  group  then  elaborated  the  attacker  model  into  the  “spiral  model”  shown  in  Fig¬ 
ure  4.3. 

Figure  4.1 

Notional  Insider  Model 


Figure  4.2 

Hanssen  Case  History 

1 .  Pre-violation  -  Granted  Access 

2.  T rigger  -  Financial  Status 

3.  Spiral  Model  (zero  or  more  occurrences  of  each  step, 

one  or  more  events  per  attack)  -v 

1 .  Identify  Consumer  /  Recruitment  /  Motivation  | 

1 .  Search  -  Known  (OJT) 

2.  Contact  -  Approached 

1.  Anonymous 

2.  Attributed 

3.  Negotiation  -  Handler  /  Instruction 

2.  Identify  Asset 

3.  Look  for  Detection  (assess  risk) 

4.  Obtain  Asset 

5.  Exfiltrate  /  Delivery 

6.  Cover  Tracks 

7.  Collect  Payment  /  Reward  /  Satisfaction  J 

4.  Exit 
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A  user  “case”  was  then  abstracted  into  a  set  of  actions  occurring  over  time,  as  shown  in  Fig¬ 
ure  4.4. 

Figure  4.3 

Spiral  Model  Flowchart 


Figure  4.4 

Insider  Attack  "Case"  Actions  Over  Time 
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In  contrast,  normal  user  actions  within  an  IC  environment  were  diagrammed  as  in 
Figure  4.5. 

The  normal  model  (Figure  4.5)  contains  “base”  behavior,  and  will  continue  while 
hostile  activity  is  being  done.  It  should  be  noted  that  normal  behavior  will  be  incorporated 
into  most  attacks.  Normal  behavior  has  analogues  in  hostile  behavior — distinguishing  the 
two  can  be  a  very  hard  problem. 

Given  the  attack  model  in  Figure  4.4,  which  of  those  actions  generate  “observables” 
that  might  be  detected?  Figure  4.6  shows  that  same  model,  with  unobservable  actions  in 
white. 

Another  view  of  an  attacker  model  is  that  the  insider  has  certain  attributes  that  are 
perhaps  measurable.  Attacks  have  certain  observables,  and  the  type  of  outcome  can  be  placed 
in  several  categories.  Figure  4.7  lists  the  attributes  and  categories  itemized  by  this  group’s 
deliberations. 

The  group  then  created  a  modified  taxonomy  of  insider  actions,  listing  a  number  of 
possible  actions  that  might  be  taken  in  the  following  categories: 

•  obtaining  access 

•  reconnaissance 

•  entrenchment  and  exploitation 

•  extraction  and  exfiltration 


Figure  4.5 

Normal  Insider  Actions 
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Insider  Attack  Actions 
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Figure  4.7 

Top-Level  View  of  Model 


Insider 

(metrics) 

•  Knowledge,  Skill, 
Ability 

•  Motivation 

•  Moral  Compass 

•  Level  of  Access 

•  Personality 

-  Social 
Engineering 

-  Risk  Tolerance 

•  Environment 


Attacks 

(observables) 

•  Technology,  Risk, 
Reward 

(increasing  over 
time?) 

•  Sophistication 

-  Low 

•  Remove 
print  /  media 

-  Medium 

•  Masqueraded 
media 

-  High 

•  Steganography 


•  communication 

•  manipulation 

•  counterintelligence. 


Outcome 

(damage) 

•  Leak 

•  Source 
Identification 

•  Mis¬ 
information 

•  Sabotage 

•  (Normal) 


It  could  also  include  other  cyber-related  activities.  The  group  indicated  the  level  of 
sophistication  for  each  action:  low,  medium,  or  high. 

The  group  then  compared  this  taxonomy  with  the  set  of  “attack  actions”  produced 
by  the  “Vulnerabilities  and  Exploits”  group  (listed  in  Table  3.1),  resulting  in  the  listing 
shown  in  Figure  4.8.  In  this  figure,  row  numbers  from  Table  3.1  are  shown  for  all  exploits 
from  that  table.  Exploits  that  do  not  appear  in  Table  3.1  are  italicized,  and  seven  exploits 
(numbers  7,  16,  19,  20,  26,  27,  33)  in  Table  3.1  did  not  have  a  place  in  this  group's  taxon¬ 
omy.3  The  parenthesized  suffix  letters  C,  I,  A,  E  in  Figure  4.8  indicate  whether  the  actions 
would  lead  to  a  breach  of  information  Confidentiality,  Integrity,  Availability,  or  would  be  an 
Enabler  of  other  attacks. 


Definitions 

As  mentioned  earlier,  each  group  was  asked  to  be  clear  about  the  definitions  of  key  terms  it 
used.  This  group  listed  the  following  definitions,  which  cover  a  number  of  terms  not  defined 
by  the  IC  System  Models  group  (Chapter  Two): 

•  Threap.  A  potential  for  deliberate  attack  or  an  inadvertent  compromise  of  an  organi¬ 
zation's  mission 

•  Attack :  A  deliberate  attempt  to  compromise  an  organization's  mission 

•  Case:  All  events  and  states  in  the  world  associated  with  a  related  set  of  attacks.  A  case 
may  also  be  a  set  of  other  cases. 


3  It  would  have  been  desirable  to  resolve  such  mismatches  among  the  various  taxonomies  used  by  different  breakout  groups, 
but  the  workshop  schedule  did  not  permit  it.  Reconciliation  of  the  various  definitions  and  taxonomies  in  this  report  would 
be  a  useful  follow-on  research  activity. 


26  Understanding  the  Insider  Threat:  Proceedings  of  a  March  2004  Workshop 


Figure  4.8 

Insider  Actions  Taxonomy  Cross-Referenced  with  Vulnerabilities  and  Exploits  List 


Entrenchment  Extraction  & 
Access  Reconnaissance  &  Exploitation  Exfiltration 


Web/ file  browsing a 
DB  searches ^ 
Unusual  searching  (17)^ 
Scanning  (stealthy)  ^ 

►  A  uthorized  account 
Orphan  account 

^  Unlocked,  unattended 
tenninals  (1 1  CIA) 

Ph  ysical  (pick  up  printout) 
w  Accidental  /  Incidental 
^  Document  control 
^  Safe  storage 

►  Two  party  rule 

►  Social  engineering  (3CE) 

^  Shoulder  surfing  (24) 

^  Password  guessing  (3CE) 


►  Download,  media  import,  or 
email  virus  /  trojan  (IE,  23 A) 

^Keystroke  logger (13CE) 
Import  published  attack  (4E, 
*8E) 

*  Install  unauthorized  software 

►  (sensor,  bot,  ...) 

►  Sabotage  patch  system  (9E) 

►  Replace  device  drivers  / 
analysis  tools 


Printing  /  Copy  machine  4 
Removable  media  (5C,  25C)^ 
Manual  classification  downgrade 
(18C,  22C)^ 
Downloads 

K,  .  ,  .  ....  Masqueraded  media< 

Need-to-know  violations  (281  v 

Downgrading  classification  (18Ch 

Stolen  laptop  (6CA)^ 
Wireless  usage  (15C)^ 
Stcganography  (2 1 C) 
Duplicate  db/log  file  backup  (14C)"* 


File  permissions 

*  Password  cracking 

*  Privilege  escalation 


Missing:  7, 16,  19,20,  26,  27,  33 

V&E  row  #  in  ();  no  match  in  V&E 


Communication  Manipulation 

j— ^Standard  encrypted  email 
> Simple  coded  messages 
,  Wireless  usage  (15C) 

Custom  encrypted  email 
Steganography  (2 1C) 

"Covert  Channels  (2 1C) 


Altering  authorized  informations 
(291,  301,  321) 
Upgrading  classification ^ 
Database  modification  (121) 
Corrupt  protections-virus  (10 A) 
Corrupt  infrastructure^ 
(23,28,3 1,321)-* 


Counter  Other  Cyber 
[ntelligence  Activities 

— ►  “ Look  over  shoulder” 
Cover  story 

^  Unusual  file  deletion 
Block  admin  access  (2AE) 
“^Search  Cl  case  files  (17) 
Disk/file  erase/wipe 
—► Modify'  Cl  case  files 
Modify  audit  logs 
^  Normal  drift 
^  Replace  device  drivers  / 
analysis  tools 


Pornography  <4—1 
Gambling  ^  I 


Sophistication: 

Low  -  Work  entirely  within  the  normal 
confines  of  the  existing  system 

Medium  -  Push  the  limits  of  the  existing 
system  (“bend  but  don’t  break”) 

High  -  Use  tools  /  technology  to  break  the 
existing  system 


•  Event :  Something  that  happens  in  the  world.  There  are  atomic  events,  and  logical 
collections  of  events  can  also  be  events. 

•  Model:  An  abstract  representation  of  some  portion  of  the  world  (examples:  sensor 
model,  attack  model,  user  model,  infrastructure  model) 

•  Observable :  Amy  event  or  state  element  that  can  (or  could)  be  measured 

•  Sensor.  Measures  an  observable.  Some  sensors  include  Level  1  fusion,  and  so  may  also 
generate  a  detection. 

•  Observation :  The  output  of  a  sensor.  May  be  a  raw  signal,  a  detection,  or  both. 

•  Detection :  A  decision  based  on  processing  of  observations  (or  collections  of  observa¬ 
tions) 

•  Fusion  Level  0:  Raw  (signal)  output  from  a  sensor 

•  Fusion  Level  1:  Processed  (analyzed)  sensor  output,  often  includes  an  initial  detection. 


Grand  Challenges — Research  Issues 

The  group  stated  as  its  overall  objective,  “We  are  not  building  a  giant  Intrusion  Detection 
System;  we  are  trying  to  build  a  set  of  tools  to  produce  indicators  to  help  the  organization 
investigate  anomalous  patterns  of  behavior.  We  want  to  shorten  the  time  from  defection  to 
detection.” 

With  this  in  mind,  the  group  listed  the  following  as  research  issues: 

•  Extend  the  taxonomy  of  insider  actions  to  include,  among  other  things, 
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—  mappings  from  user  roles  to  expected  actions  and  to  permitted  actions.  (The  first 
helps  characterize  normal  behavior;  the  differences  between  the  two  mappings 
help  characterize  malicious  behavior.) 

—  mappings  from  actions  to  observables  and  observables  to  sensors.  (This  should 
include  both  existing  sensors  and  desirable  new  sensors.  One  approach  would  be 
to  work  with  Cl  experts  to  identify  sensors  that  would  detect  the  events  and 
extract  the  information  that  they  use.) 

•  Develop  a  characterization  of  normal  system  administrator  behavior,  possibly  using 
Glass  Box  software  (see  Appendix  H). 

•  Develop  techniques  for  identifying  triggers  that  distinguish  between  normal  and 
malicious  behavior. 

•  Develop  techniques  to  identify  multiple  cases  with  a  common  objective. 

•  Build  a  library  of  attacks  (scenarios)  that  researchers  can  use  to  train  and  test  their 
anomaly  detection  systems. 

The  first  research  issue  is  related  to  the  question  of  what  datasets  the  researchers  need 
to  pursue  their  research.  Some  researchers  said  they  could  use  a  realistic  document  corpus, 
possibly  the  NIMD  (see  Appendix  I)  WMD  corpus.  There  were  several  researchers  working 
on  methods  for  anomaly  detection,  and  they  all  agreed  that  they  need  data  having  sets  of 
attacks  interleaved  with  normal  user  activity.  They  need  preliminary  datasets  to  train  their 
anomaly  detection  systems  and  more  sophisticated  datasets  to  validate  and  tune  their  sys¬ 
tems. 

There  were  five  proposals  for  how  to  generate  the  datasets,  listed  below  in  approxi¬ 
mate  order  of  preference: 

A.  The  existing  data  from  the  MITRE  workshop  might  be  adequate  as  a  starter  set. 

B.  It  might  be  possible  to  use  existing  Glass  Box  data  as  a  source  of  normal  activity.  It 
would  be  good  to  find  a  way  to  inject  attack  data,  from  the  library  generated  above, 
into  these  existing  data. 

C.  An  alternative  would  be  to  seed  the  Glass  Boxed  environment  with  some  attackers 
in  the  future.  These  attackers  could  mount  attacks  from  the  library  while  other 
users  are  going  about  their  normal  activities.  Glass  Box  could  collect  the  data  and 
provide  them  to  the  researchers. 

Both  B  and  C  depend  on  a  positive  assessment  of  Glass  Box  data  by  the  researchers. 
Right  now,  it  is  not  known  for  sure  if  Glass  Box  provides  what  they  need,  but  participants 
are  hopeful. 

D.  It  would  be  possible  to  develop  a  simulator  to  generate  the  required  datasets.  The 
initial  datasets  could  be  fairly  simple,  requiring  a  simple  simulator.  The  hope  would 
be  that  the  simulator’s  capabilities  would  grow  in  a  manner  and  at  a  rate  compatible 
with  the  expanding  needs  of  the  researchers. 

E.  If  none  of  the  above  worked  out,  it  would  be  possible  for  the  researchers  to  share 
the  datasets  they  create  for  themselves. 
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Whichever  approaches  are  chosen,  it  will  be  necessary  to  define  the  information 
needed  and  a  data  schema  and  format  so  that  the  researchers  can  share  the  datasets. 


Surprising  Lessons  Learned 

In  concluding  its  deliberations,  the  group  listed  the  most  surprising  outcomes  from  its  dis¬ 
cussions  as  the  following: 

•  Workshop  participants  still  don’t  have  a  good  understanding  of  what  is  important  to 
look  for  as  opposed  to  what  is  not  (but  perhaps  they  just  didn’t  focus  attention  on 
this  issue). 

•  They  became  aware  of  the  availability  of  NIMD  data  and  tools,  which  are  seemingly 
relevant  and  were  previously  unknown  to  them. 

•  The  group  was  struck  by  the  lack  of  clear  event  models. 


CHAPTER  FIVE 

Event  Characterization 


Terminology 

This  breakout  group1  started  by  asking,  “What  is  an  eventT 

•  An  event  is  an  occurrence  in  a  system  that  is  directly  observable. 

•  It  does  not  reflect  judgment  or  interpretation  (by  contrast  with  detection). 

•  It  is  discrete  (atomic)  at  the  chosen  level  of  abstraction. 

•  Events  are  composable  into  higher-level  events. 

•  Inline  or  mediating  sensors  are  useful  for  exposing  events  in  real-time. 

•  Events  can  be  characterized  by  data  of  varying  degrees  of  fidelity. 

Considering  event  modeling,  there  are  two  kinds  of  models:  models  characterizing  expected 
(normal)  behavior  and  malevolent  insider  (MI)  behavior  models.  A  significant  event  fits  or 
deviates  from  one  of  these  models.  That  raises  the  following  questions: 

•  How  should  we  characterize  events  so  that  we  can  determine  that  they  are  significant? 

•  What  information  needs  to  be  captured  when  an  event  occurs  so  that  we  can  deter¬ 
mine  if  the  event  fits  or  deviates  from  a  model? 

The  set  of  relevant  events  is  influenced  by  (but  should  not  be  limited  by)  the  models  being  fit. 


Events — Considerations 

The  group  then  discussed  host-based  events  as  having  the  following  characteristics:2 

•  Action  by  a  subject  involving  an  object 

•  Action  is  then  defined  as  request  for  or  the  result  of  a  system  service  or  high-level 
application  activity 

—  Can  be  high  or  low-level 

—  Characterized  by  time,  arguments,  return  value,  status,  etc. 


1  Participants  were  Bruce  Gabrielson,  Greg  Kipper,  Elizabeth  Liddy,  Roy  Maxion,  Kymie  Tan,  Lisa  Yanguas,  Dennis 
Heimbigner,  Scott  Lewandowski,  David  Mankins,  T.  J.  Smith,  and  Feiyi  Wang. 

2  The  general  issue  of  event  characterization  has  been  studied  by  others.  The  group  used  (but  was  not  limited  to)  past  work 
in  this  area  by  Price  (1997)  and  Doyle  et  al.  (2001). 
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•  Subject  is  defined  as  the  user  and  the  process  acting  on  behalf  of  the  user 

—  Characterized  by  user  name,  identifiers  (e.g.,  electronic  user  identification 
[EUID],  radio  frequency  user  identification  [RUID],  impersonation  token,  etc.), 
group  membership  identifiers,  process  identifiers,  terminal  identifiers,  etc. 

•  Object  is  considered  to  be:  a  protected  system  resource 
—  Can  be  physical  or  logical 

—  Characterized  by  names  and/or  identifiers,  type,  access  permissions,  locations, 
owner,  etc.  (information  available  varies  greatly) 

•  If  an  action  changes  the  object’s  attributes,  the  old  and  new  information 
should  be  recorded. 

With  this  terminology  in  hand,  the  question  then  becomes,  “How  can  it  be  determined  if  an 
event  is  significant?”  The  group  believed  that  for  each  event,  there  are  two  hypotheses  to  be 
tested: 


•  The  event  is  the  result  of/indicator  of/caused  by  the  activities  of  a  malicious  insider 
(MI),  or 

•  The  event  is  not  the  result  of/indicator  of/caused  by  the  activities  of  an  MI. 

What  can  be  observed  to  test  these  hypotheses?  The  group  listed  four  possibilities: 

•  Things  that  we  currently  observe  (can  and  do  observe) 

•  Things  we  would  like  to  observe  (can  but  do  not  observe) 

•  Things  we  cannot  observe,  even  in  principle  (e.g.,  intent) 

•  Things  we  can  indirectly  observe  (i.e.,  inferences  from  observations). 

An  event  “life  cycle”  was  characterized  as 

•  the  recording,  or  sensing,  of  events 

•  interpretation  activities — fitting  events  to  models,  to  determine  if  they  were  really 
caused  by  an  MI  and  to  correlate  them  (or  at  least  attempt  to  attribute  them  to  the 
same  insider) 

•  analysis  of  “significant”  events  to  learn  about  the  MI:  identify  his  goals,  infer  the  past 
and  predict  future  activity,  and  assess  damage,  both  performed  and  potential. 


Data  Collection 

The  group  then  developed  the  following  “waterfall”  diagram  (Figure  5.1)  to  indicate  proc¬ 
esses  that  data  about  an  event  undergo,  and  indicating  that  steps  further  toward  the  upper 
right  will  lead  to  increasing  probability  that  the  hypothesis  being  tested  about  an  event  will 
test  as  “true.” 
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Collection  and  Analysis 

The  diagram  in  Figure  5.1  lists  six  categories  of  data.  But  what  processes  create  one  form  of 
data  from  another?  Given  an  event  (the  leftmost  box  in  Figure  5.1),  the  group  characterized 
the  collection  effort  as  having  the  steps  shown  in  Figure  5.2,  involving  the  actions  of  sensors 
and  detectors  in  converting  raw  events  into  indicators  or  reports  (boxes  3  and  4  in  Figure 

5.1). 

Then,  given  a  report,  the  additional  data  boxes  in  Figure  5.1  are  created  through 
‘‘fusion”  actions  or  analysis  steps  (which  themselves  may  access  various  models  as  part  of 
their  processing),  as  shown  in  Figure  5.3. 

In  this  manner  (as  shown  in  Figures  5.2  and  5.3),  raw  events  are  transformed 
through  a  series  of  processing  steps  into  a  case,  or  set  of  incidents  (the  rightmost  box  in  Fig¬ 
ure  5.1). 

Figure  5.1 

Data  Collection  Steps  Regarding  an  Event 


Figure  5.2 
Collection  Steps 
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Figure  5.3 
Analysis  Steps 


Observables 

The  group  then  turned  its  attention  to  the  observables  resulting  from  events  caused  by  mali¬ 
cious  insiders.  Its  general  approach  was  to  create  a  taxonomy  for  the  insider,  including 
his/her  goals,  actions  to  support  those  goals,  and  observables  associated  with  the  actions. 
Then,  for  each  observable,  the  questions  are 

•  Which  are  detectable  and  which  are  undetectable? 

•  What  new/improved  sensors  are  required  to  adequately  capture  the  observable? 

•  What  information  must  be  included  in  reports  regarding  these  observables  for  the 
reports  to  be  useful? 

The  group  decided  to  focus  on  cyber  events  only  (that  is,  information  that  information 
assurance  researchers  and  operators  could  provide  to  counterintelligence  personnel),  and  to 
disregard  most  physical  observables,  even  if  they  can  be  converted  into  cyber  observables. 

In  briefing  these  results,  the  group  felt  it  had  made  a  good  start  on  this  effort,  but 
that  more  work  would  be  required  to  complete  the  task. 

Observables  from  Attacks  on  Confidentiality 

Regarding  attacks  on  the  confidentiality  of  information,  the  group  felt  there  were  two  means 
by  which  those  attacks  could  be  detected: 

•  Identify  data  access  consistent  with  the  MI  threat. 

•  Detect  any  exfiltration  (with  the  need  for  special  exceptions  for  moving  data  among 
organizations  or  among  security  levels). 

Disclosure  of  confidential  information  could  be  performed  by  a  malicious  insider  through  a 
number  of  means: 

•  Accessing  legitimately 

•  Violating  “need  to  know” 

•  Stealing/using  another  identity 

•  Exploiting  a  misconfiguration 

•  Exploiting  an  application  vulnerability 

•  Violating  access  control  (e.g.,  via  privilege  escalation). 


Event  Characterization  33 


Malicious  insider  access  to  the  confidential  data  could  create  observables  in  the  following 
areas: 

•  Host  activity  (system  calls,  resource  utilization,  etc.) 

•  Host,  operating  system  (OS),  middleware  audit/log  data 

•  Host  activity  records  (e.g.,  browser  cache) 

•  Network  activity 

•  Application  records  of  access  (but  probably  does  not  apply  to  exploitation  of  an 
application  vulnerability). 

Exfiltration  of  confidential  data  obtained  in  an  exploit  could  then  be  achieved  by 

•  writing  the  data  to  various  media 

•  media/HW  insertion 

•  network  activity  (e-mail,  scp,  ftp,  replication,  print,  etc.) 

•  context  switches 

•  application-level  commands 

•  host,  application,  and  server  print  logs 

•  unobservable  techniques:  memorize,  create  images  (camera),  direct  broadcast. 

Observables  from  Corruption  of  Information 

The  group  indicated  that,  for  automated  systems,  the  distinction  between  overt/blatant  and 
covert/subtle  changes  to  intelligence  data  or  information  is  not  relevant;  it  requires  human 
judgment  to  make  that  distinction. 

The  observables  that  might  be  detected  resulting  from  adding,  changing,  or  deleting 
data  might  be 

•  data  inconsistency  (to  the  extent  that  consistency  can  be  codified  in  a  model) 

•  host  activity  (system  calls,  resource  utilization,  etc.) 

•  host,  OS,  middleware  audit/log  data 

•  host  activity  records  (e.g.,  browser  cache) 

•  network  activity 

•  application  records  of  access  (hopefully  characterized  in  semantic  terms,  e.g.,  a  global 
find/replace). 

Observables  from  Degradation  of  Availability/ Access  to  Information 

A  third  type  of  attack  on  intelligence  data  involves  denying  others  access  to  critical,  relevant 
information.  Access  might  be  degraded  or  denied  to  networks,  hosts,  applications,  or  by 
changes  in  system  policy  (e.g.,  as  recorded  in  online  distribution  and  dissemination  lists). 
Observables  resulting  from  attempts  at  such  denial  are  as  follows: 

•  Degrade  networks 

—  Network  infrastructure  audit/log  data 

—  Host,  OS,  middleware  audit/log  data  (e.g.,  broken/denied  connections)  that  is 
causing  or  resulting  from  the  denial 
—  Direct  network  measurement  (e.g.,  tcpdump). 
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•  Degrade  hosts 

—  Host,  OS,  middleware,  application  audit/log  data  (e.g.,  process  accounting,  gen¬ 
eral  host  statistics) 

—  Application  indicators  (e.g.,  misbehaving  applications). 

•  Degrade  applications 
— An  open  challenge 

—  Requires  an  application  specification  (may  be  able  to  check  against  a  quality  of 
service  [QoS]  model/guarantee). 

•  Invoke  system  policy  change  or  exploit  system  policy  (e.g.,  cause  a  user  to  be  locked 
out) 

—  Host,  OS,  middleware,  application  audit/log  data 
—  Problem  ticket  trends. 

Observables  from  Pre-Attack  Activities 

Finally,  the  group  listed  several  categories  of  observables  from  an  attacker's  reconnaissance  or 
discovery  activities,  or  activities  to  obtain  access  to  a  system  or  resource: 

•  Reconnaissance/discovery 

—  Network  activity  (e.g.,  probes) 

—  Host,  OS,  middleware  audit/log  data 
—  Honeypots  (special  role) 

•  Acquire  access  or  control  of  a  system/resource  (lifecycle  or  post-deployment  attacks) 

—  Host,  OS,  middleware  audit/log  data. 


Research  Issues  and  Questions 

This  group's  “grand  challenge”  research  questions  were  the  following. 

Research  Issues — Event-Related 

•  An  adequately  expressive  language  for  describing  and  recognizing  patterns  and  events 
is  needed 

—  To  facilitate  identification/classification  of  user  behavior 
—  To  codify  tell-tale  signs  of  insider  activity 
—  Requires  multilevel  abstractions  and  mechanisms 
—  To  express  uncertainty  in  characterizing  events 

—  Requires  features  like  landmark  times,  temporal  intervals,  and  temporal  and  state 
relationships. 

•  Examples  of  event  reporting  language  research  efforts  exist  that  might  be  built  upon 
and  instructive,  such  as 

—  STATL  (attack  recognition,  including  sequential,  iterative,  and  conditional 
events) 

—  CISL  (ID  reporting;  useful  lessons  learned  from  CIDF) 

—  CYCL  (knowledge  representation  and  reasoning,  very  general) 

—  P-BEST  (Emerald  recognition  and  correlation) 

—  IDMEF  (Intrusion  Detection  Message  Exchange  Format). 
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•  The  implications  of  varying  levels  of  abstraction  must  be  recognized  (and  addressed). 
—  There  are  difficulties  to  be  overcome  in  unifying/flattening  or  decomposing  event 

records. 

—  Some  detectors  (e.g.,  some  data  fusion  systems)  may  require  a  single  level  of 
abstraction. 

•  Can  events  be  associated  with  MI  attack  phases  or  goals? 

•  What  is  an  effective  way  to  prioritize  events,  incidents,  etc.,  for  human  analysis? 
What  information  needs  to  be  captured  to  perform  the  prioritization  and  to  give  the 
human  analysts  a  good  starting  point? 

Research  Issues — Creating  Useful  Sensors 

•  What  capabilities  and  attributes  should  sensors  have? 

•  Some  novel  sensor  ideas  are  worth  exploration: 

—  Honeytokens 

•  Simple  and  relatively  easy  to  deploy 

•  Major  drawbacks:  “bad”  (inaccurate)  data  are  introduced  into  the  system 

•  Often,  may  be  rejected  by  certain  IC  organizations 

•  Effectiveness  against  sophisticated  (especially  malicious  insiders)  is  questionable 
—  Bloom  filters:  highly  compressed  signatures  (e.g.,  for  a  document) 

•  Example:  Trace  denial  of  service  (DoS)  attacks  using  packet  signatures 

•  Example:  Watch  for  the  signature  of  a  honeytoken  in  host  and  network  activ¬ 
ity* 

—  Social  network  analysis:  Watch  for  changes  in  user  and  system  communication 
patterns. 

•  Need  models  of  what  analysts  should  be  doing. 

—  Semantic  analysis:  Watch  for  changes  in  a  user’s  “topic  of  interest.” 

•  Need  to  model  a  user’s  data  flows  (e.g.,  e-mails,  queries,  etc.)  semantically. 

•  What  information  needs  to  be  recorded  in  an  event  record? 

—  What  context  is  required? 

—  What  constitutes  a  meaningful  event  record? 

•  What  standards  and  specifications  would  be  useful  for  ensuring  that  sensor  reports 
are  accurate  and  that  sensors  can  be  easily  integrated  into  new  systems  (e.g.,  a  com¬ 
mon  interpretation  of  terms,  a  commonly  agreed  upon  scale  for  severity  and  confi¬ 
dence,  etc.)?' 

—  What  standard  reporting  formats,  syntaxes,  and  methods  for  recording  meaning 
(semantics)  are  required? 

—  What  does  it  mean  for  a  sensor  to  be  “self-describing”?  Is  this  a  necessary  and/or 
useful  capability? 

•  What  can  be  done  to  protect  sensor  systems  from  attack  and  ensure  sensor  outputs 
are  trustworthy? 

•  How  can  applications  be  instrumented  to  extract  useful  data? 

•  What  must  be  done  to  meet  “chain  of  evidence”  requirements? 

Research  Issues— Sensor  Applications 

•  How  can  collaborative  processing  (combining  evidence  by  using  multiple  sensors  in 
concert  to  gain  higher  overall  confidence  in  sensor  reports)  be  formalized? 
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—  What  is  required  so  that  differing  sensors  can  participate  in  a  collaborative  proc¬ 
essing  system? 

—  How  should  individual  sensor  inputs  be  weighted? 

—  Can  “fusion”  be  used  to  create  useful  synthetic/compound  events? 

—  Which  techniques  are  most  amenable  to  scaling  to  large  datasets  associated  with 
“low  and  slow”  attacks? 

•  What  sensor  data  or  metadata  are  required  to  detect  changes  in  a  user's/machine's 
logical  identity  (i.e.,  his/its  role)?  Can  events  be  matched  to  role-based  access  control 
models  “in  the  background”  to  detect  when  users/machines  switch  roles? 

•  How  can  trends  be  identified  and  represented  in  sensor  reports? 

•  How  can  models  and/or  policy  be  used  to  enhance  what  has  been  observed  (by 
refining  observation  streams  via  noise  reduction)?  For  instance,  if  policy  disallows 
port  scans,  all  port  scans  merit  investigation. 

•  How  can  false  alarms  and  false  positives  be  identified? 

—  How  should  detectors  be  tuned,  and  what  issues  affect  the  tuning? 

—  Tuning  must  be  continuous,  and  needs  to  accommodate  “model  drift”  (e.g., 
changes  in  traffic,  data,  and  other  behaviors)  but  cannot  be  exploitable  by  an 
adversary. 

Research  Issues — Building  and  Working  with  Models 

•  How  should  models  be  expressed? 

—  How  can  events  be  related  to  those  models? 

—  Parameterization  of  model  impacts  its  performance  (e.g.,  a  neural  network  has 
learning  and  momentum  constants  but  users  don’t  understand  them). 

•  How  can  degree  of  fit  to  or  deviation  from  the  model  be  measured? 

—  How  much  deviation  merits  treating  an  event  as  “significant?” 

Research  Issues— Testing  and  Evaluation 

•  What  datasets  are  required  to  test  sensors? 

—  What  metrics  are  relevant  to  datasets? 

—  What  metadata  (e.g.,  ground-truth)  need  to  be  associated  with  the  dataset?  Where 
do  these  metadata  come  from? 

•  What  tools  are  required  to  manipulate  datasets  into  a  form  that  can  be  processed? 

Research  Issues — Miscellaneous 

•  To  support  analysis,  the  state  of  the  subject  and  object  when  an  event  occurred  needs 
to  be  available — how  should  this  be  done? 

—  Explicitly:  Record  system  state  data  directly  in  the  event  record. 

—  Implicitly:  Recreate  the  state  of  the  system  from  other  recorded  observations 
(some  of  which  may  be  in  prior  event  records). 

—  What  are  the  pros  and  cons  of  self-contained  versus  cumulative  audit  records? 
What  are  the  pros  and  cons  of  the  associated  redundancy? 

•  When  should  audit  records  be  generated  for  an  action?  (Consider  non-atomic 
actions:  What  are  the  pros  and  cons  of  auditing  before  and  after  the  action  is  com¬ 
pleted?) 
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•  How  should  the  level  of  attacker  sophistication  impact  sensor  development  and 
research? 

•  How  can  sensor  information  be  shared  effectively? 

—  How  can  sensors  be  most  effectively  used,  deployed,  and  directed?  What  feedback 
can  be  used  to  help  with  these  tasks? 

•  How  can  backward  chaining  from  a  hypothetical  attack  to  the  events  that  comprise 
the  early  stages  of  the  attack  be  accomplished  in  an  automated  manner? 

—  We  need  to  determine  if  events  are  indications  and  warning  of  that  attack  at  the 
time  they  are  observed. 


Grand  Challenge  Research  Problems 

The  group  concluded  its  deliberations  by  creating  problem  descriptions  and  research  chal¬ 
lenges  for  two  “grand  challenges.” 

Challenge  1:  Combining  Events 

Problem  Description.  Combine  events  from  one  or  more  sensors  (possibly  of  various  types 
and/or  different  levels  of  abstraction)  to  facilitate  building  systems  that 

•  test  hypotheses  about  MI  activity 

•  gain  higher  overall  confidence  in  sensor  reports 

•  detect  MI  activity  that  is  not  detectable  using  a  single  event  record 

•  reduce  data  without  adversely  impacting  detection. 

Challenge.  Determine  what  is  required  so  that  arbitrary  sensors  can  participate  in  a 
collaborative  processing  system.  This  involves  the  following: 

•  Developing  a  “calculus  of  evidence”  (Does  your  evidence  support  your  hypothesis?) 

•  Dealing  with  metrics  (e.g.,  normalization,  new  metrics,  meta-metrics,  finding  a 
common  frame  of  reference,  weighting  of  individual  inputs,  syntax  and  semantics  of 
metric  reports,  dealing  with  multiple  levels  of  abstraction) 

•  Identifying  and  expressing  relationships  among  pieces  of  evidence 

•  Determining  how  “fusion”  can  be  used  to  create  useful  synthetic/compound  events 

•  Identifying  techniques  that  can  scale  to  large  datasets  associated  with  “low  and  slow” 
attacks. 

Challenge  2:  Exploiting  Models  and  Policies 

Problem  Description.  Improve  detection  performance  by  using  models  and  policies. 

•  Models 

—  Run-time:  Focus  attention  on  the  most  significant  observations  (feature  extrac¬ 
tion). 

—  Design-time:  Reason  about  users  and  systems  to  recognize  observations  that  are 
not  currently  receiving  focus  (this  can  guide  development  of  novel  policies  and 
detection  methods). 
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•  Policies 

—  Run-time:  Policy  violations  are  strong  indicators  of  unauthorized  behavior. 

—  Design-time:  Direct  the  deployment  of  sensors  (to  observe  all  policy  violations). 

Challenges. 

•  How  should  models  and  policies  be  expressed,  monitored,  measured,  etc.? 

•  How  can  events  be  related  to  models  and  policies? 

•  How  can  machines  (software  programs)  understand  the  implications  of  policies? 

•  How  do  we  design  models  that  are  usable  by  “regular  users?” 

•  What  sensor  data  and  metadata  are  required  to  detect  changes  in  a  user’s/machine’s 
logical  identity  or  role? 


APPENDIX  A 

Workshop  Invitation 


February  2,  2004 
Dear  Colleague, 

Public  concern  for  information  security  has  been  focused  on  the  problem  of  preventing  harm 
resulting  from  the  actions  of  an  outside  attacker.  However,  there  is  greater  risk  of  damage 
due  to  hostile  actions  performed  by  insiders,  possibly  in  collaboration  with  agents  of  an  out¬ 
side  organization.  One  insider  may  transmit  large  volumes  of  sensitive  information  outside 
the  organization  with  little  risk  of  detection.  Another  insider  might  make  subtle,  or  not  so 
subtle,  modifications  to  a  critical  database,  resulting  in  large  losses  in  human  or  financial 
terms.  These  behaviors  are  especially  insidious  if  the  insider  installs  software  that  could  per¬ 
petuate  this  behavior  even  after  the  attacker  has  left  the  organization. 

The  risk  of  insider  attacks  is  greatest  for  systems  that  contain  high  value,  mission 
critical  data.  These  high  value  targets  may  be  classified  or  unclassified  Government  systems 
or  systems  in  the  private  sector.  They  all  tend  to  attract  the  attention  of  well-funded  organi¬ 
zations,  including  foreign  governments,  that  are  willing  to  recruit  insiders  in  an  effort  to 
mount  sustained,  well-planned  penetrations  into  an  organization’s  cyber  assets. 

The  sorts  of  attacks  mounted  by  these  adversaries  depend  on  many  factors,  including 
motivation  and  objectives,  level  of  knowledge  about  the  system,  authorized  access  to  the 
system,  tolerance  for  risk,  and  specific  computer  skills,  including  the  skills  of  outside  collabo¬ 
rators  working  with  the  insider.  Effective  defense  against  insider  attacks  must  be  based  on  a 
realistic  understanding  of  the  behaviors  that  different  classes  of  insiders  are  likely  to  exhibit. 

ARDA  invites  you  to  participate  in  a  workshop  on  “Understanding  the  Insider 
Threat.”  This  three  day  workshop  will  bring  together  members  of  the  Intelligence  Commu¬ 
nity  (IC)  with  specific  knowledge  of  IC  document  management  systems  and  IC  business 
practices,  individuals  with  knowledge  of  inside  attackers,  both  within  and  outside  the  IC, 
and  researchers  who  are  involved  in  developing  technology  to  counter  insider  threats.  The 
objectives  of  the  workshop  are: 
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•  To  generate  and  capture  domain  knowledge  that  will  benefit  the  broad  base  of 
researchers  studying  the  Insider  Threat.  This  includes  but  is  not  limited  to  knowl¬ 
edge  about: 

—  Inside  attacker  characteristics,  including  the  vulnerabilities  they  tend  to  exploit, 
and  the  attack  methods  they  use, 

— Attack  characterization,  including  the  necessary  or  likely  pre-conditions  for  an 
attack,  the  observables  generated  during  an  attack,  and  the  effects  of  the  attack. 

—  The  network  and  application  systems  used  by  the  IC  for  document  management, 
including  the  mechanisms  used  to  protect  the  systems  and  data. 

—  IC  business  models  for  generating  and  controlling  access  to  documents. 

•  To  foster  cooperation  among  researchers  by  developing,  to  the  extent  it  is  practical, 
methods  for  describing  common  aspects  of  their  work,  such  as  event  characterization, 
attack  and  attacker  classification,  etc. 

•  To  focus  researchers  on  specific  systems  and  problems  of  interest  to  the  IC.  We 
expect  these  to  take  the  form  of  challenge  problems. 

We  are  looking  forward  to  a  fruitful  discussion  on  these  important  topics.  ARDA 
intends  to  document  the  results  of  the  workshop  in  a  RAND  report. 

Enclosed  is  additional  information  regarding  the  workshop  format,  logistics,  and 
some  important  pre-work  we  would  like  each  of  you  to  do  prior  to  the  meeting. 

Workshop  Location  and  Dates: 

The  workshop  will  be  held  2-4  March  2004  at  McAfee  Research,  Network  Associates,  Inc., 
15204  Omega  Drive,  Suite  300,  Rockville,  MD  20850.  Office  phone:  301-527-9500.  See 
below  for  directions  and  hotel  information. 

Workshop  Format: 

As  you  can  see  from  the  appended  agenda,  this  workshop  will  be  less  presentation-centric 
than  many  workshops.  The  approach  is  to  have  presentations  by  domain  experts  to  provide 
background  material  and  to  then  break  into  subgroups  to  discuss  issues  and  approaches,  fol¬ 
lowed  by  plenary  sessions  to  compare  results  and  synchronize  the  groups.  When  registering, 
please  provide  a  prioritized  list  of  the  subgroup  topics  (identified  in  the  agenda  below) 
ranked  according  to  your  interests  and  expertise.  We  will  use  these  lists  to  pre-assign 
attendees  to  breakout  sessions. 

Pre-Workshop  Material: 

Prior  to  the  workshop  we  will  distribute  a  package  of  reading  material  to  help  you  prepare. 
We  request  that  each  of  you  please  prepare  (no  more  than  2  pages)  some  informal  thoughts 
on  the  following  topics  as  related  to  your  area  of  expertise: 

i.  A  description  and  assessment  of  the  top  two  or  three  issues  associated  with  under¬ 
standing  the  Insider  Threat, 

ii.  Your  thoughts  on  how  to  address  these  issues,  and 

iii.  Very  brief  descriptions  of  related  efforts  being  pursued  in  your  area  of  expertise. 
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Registration: 

By  Feb  10,  please  e-mail  your  registration  information,  or  regrets,  to  research^ 
mcafeesecurity.com.  Include  the  following  details: 

•  Full  name 

•  Prefix  (Mr.  /  Ms.  /  Dr.  /  military  rank  /  etc.) 

•  Name  for  badge 

•  Organization 

•  Postal  address 

•  E-mail  address 

•  Telephone  number 

•  Fax  number 

•  Cell  number 

•  Hotel  (if  applicable) 

•  Your  prioritized  list  of  subgroup  topics  ranked  according  to  your  interests  and  exper¬ 
tise 

Meals  /  Refreshments: 

Continental  breakfast,  lunch,  afternoon  snack,  and  drinks  will  be  provided  each  day. 

Workshop  Fee: 

There  is  a  workshop  fee  of  $75.00  to  cover  cost  of  meals,  refreshments,  and  workshop  sup¬ 
plies.  By  February  17,  2004,  print  a  copy  of  your  registration  information  and  mail  it,  along 
with  check  or  money  order  payable  to  Network  Associates  for  your  workshop  fee,  to  Dana 
Coon,  McAfee  Research,  15204  Omega  Drive,  Suite  300,  Rockville,  MD  20850. 

Information  on  Hotels: 

There  are  rooms  generally  available  at  a  wide  variety  and  number  of  hotels  in  the  area.  Please 
make  your  reservations  as  soon  as  possible.  Here  are  a  few  suggestions: 

Homestead  Gaithersburg/Rockville 
2621  Research  Blvd. 

Rockville,  MD 
Tel:  301-987-9100 

Courtyard  by  Marriott-Rockville 
2500  Research  Blvd. 

Rockville,  MD 
Tel:  301-670-6700 

SpringHill  Suites  Gaithersburg 
9715  Washingtonian  Blvd. 

Gaithersburg,  MD  20878 
Tel:  301-987-0900 
Van  service  available 
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Gaithersburg  Marriott  Washingtonian  Center 
9751  Washingtonian  Blvd. 

Gaithersburg,  MD  20878 

Tel:  301-590-0044 

Rooms  have  not  been  blocked. 

Directions  to  McAfee  Research: 

McAfee  Research  is  located  in  the  Rockville,  MD,  offices  of  Network  Associates,  in  the 
greater  Washington,  DC  metropolitan  area,  off*  Interstate  270. 

From  1-270  NORTH,  take  Exit  8  for  Shady  Grove  Road  WEST.  Stay  in  the  right 
lanes  on  the  frontage  road  until  it  veers  right,  then  stay  in  one  of  the  left  2  lanes  and  proceed 
to  the  light.  Make  a  LEFT  onto  Shady  Grove  Road.  Cross  over  1-270.  Turn  RIGHT  onto 
Research  Boulevard.  Turn  LEFT  onto  Omega  Drive.  The  Network  Associates  building  is  on 
the  right. 

From  1-270  SOUTH,  take  Exit  8  for  Shady  Grove  Road  WEST.  Stay  in  the  right 
lanes  on  the  frontage  road  until  it  veers  right,  and  then  stay  in  the  right  lane.  Take  the  exit 
for  Omega  Drive  in  the  far  right  lane.  (If  you  miss  the  Omega  Drive  exit,  you  can  still  pick 
up  the  directions  from  1-270  NORTH  above,  starting  with  the  point  where  you  cross  over  I- 
270.)  Turn  LEFT  onto  Omega  Drive.  The  Network  Associates  building  is  on  the  right. 

Turn  RIGHT  into  the  entrance.  The  Network  Associates  building  is  now  to  your 
right  front.  A  4-foot  cube-shaped  sign  with  the  Network  Associates  logo  and  name  is  in  front 
of  the  building. 

Enter  the  building  into  the  lobby  area  through  the  double  glass  doors.  Take  an  eleva¬ 
tor  located  on  your  left  to  the  third  floor. 

Parking 

Parking  is  in  the  areas  in  front  of,  to  the  right  of,  and  behind  our  building.  You  may  ignore 
signs  indicating  permit  parking.  The  only  restrictions  are  marked  handicapped  spaces. 

Shuttle  /  Van  Service 

If  you  wish  to  use  this  service,  contact  your  hotel  desk  for  availability  and  procedure.  Note 
that  there  may  be  a  fee. 

Contact  Information— For  more  information,  please  contact: 

Technical: 

Dick  Brackney:  301-688-7092 
John  Farrell:  443-479-4370 

Administrative/Logistics: 

Jack  Oden:  301-947-7159,  Cell:  703-402-8574 
Dana  Coon:  301-947-7275. 
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Workshop  Agenda 


ARDA  Understanding  the  Insider  Threat  Workshop  Agenda 

The  general  approach  for  the  workshop  will  be  to  use  the  large  group  to  provide  background 
information  to  the  participants,  focus  the  group,  and  gather  results,  and  to  use  a  set  of  small- 
group  breakout  sessions  to  discuss  the  issues  and  recommend  solutions  or  areas  of  needed 
research. 


2  March  2004 

The  objective  for  this  day  is  to  establish  a  baseline  context  and  knowledge  set  for  you  to  use 
for  the  rest  of  the  workshop.  Speakers  with  domain  knowledge  will  provide  introductory 
presentations,  and  each  of  you  will  have  the  opportunity  to  state  your  priorities  and  to 
describe  what  you  have  to  offer  the  group. 

7:30  AM  Continental  Breakfast/Coffee 

Session  1:  Workshop  Purpose  and  Context — Plenary  Session 

8:00  Welcome  and  Workshop  Charge — Dick  Brackney 

8:15  Introductory  Remarks — Sherrill  Nicely,  Information  Assurance  Director,  IC 
CIO 

8:30  Speaking  with  Analysts:  Observations  of  Current  Practices  with  Massive  Data, 
William  Wright,  Oculus  Info  Inc. 

9:10  IC  Document  Management  and  Dissemination  Systems,  speaker  TBD 
10:00  Break 

10:30  Overview  of  the  Hanssen  case,  Robert  Anderson,  the  RAND  Corporation 
10:50  Insider  Behavior — Speaker  TBD 

11:30  Report  on  the  MITRE  Workshop  on  Indications  and  Warnings  for  the  Insider 
Threat:  Mark  Maybury,  The  MITRE  Corp. 


12:15  Lunch 

Session  2:  Identify  Needs  and  Research  Topics — Breakouts 
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1 :30  Review  agenda,  expectations,  and  first  breakout  assignment 

Break  into  subgroups:  IC  System  Models,  Attacker  Models,  Event  Characteriza¬ 
tion,  Vulnerabilities  and  Exploits 
1:45  Review  the  pre-work  (use  to  scope  the  issues) 

Each  Group  Participant  Lists  (<  5  minutes  each): 

—  Knowledge  acquired  to  date 
—  Top  needs 

—  Most  promising  sources  and  approaches  for  acquisition 
—  Summarize  key  points  in  Power  Point  file 
3:30  Break 

4:00  Report  out  to  large  group  (15  minutes  per  group) 

—  Each  group  presents  key  points  to  large  group 
—  Discuss  for  clarity 
—  Consolidate  points  where  possible 


3  March  2004 

The  objective  for  today  is  to  identify,  generate,  and  capture  the  knowledge  that  researchers 
need  to  work  in  this  area.  We  expect  this  to  happen  through  a  dialogue  between  researchers, 
domain  experts,  and  security  practitioners.  Tomorrow  you  will  use  this  knowledge  to  define 
a  set  of  challenge  problems  for  the  researchers. 

7:30  AM  Continental  Breakfast/Coffee 
Session  3:  Generating  the  Knowledge 

8:00  Review  Agenda,  expectations,  and  second  breakout 

Break  into  subgroups:  IC  System  Models,  Attacker  Models,  Event  Characteriza¬ 
tion,  Vulnerabilities  and  Exploits 

8:15  Identify  and  address  top  knowledge  needs 

—  Identify  what  is  known:  document  sources 
—  Identify  what  else  researchers  need  to  know 

•  Work  as  team  to  fill  knowledge  gaps 

•  Identify  approaches  for  further  work 
—  Summarize  key  points  in  Power  Point  file 

10:15  Break 

10:45  Large  group  report  out  (15  min  per  group) 

—  Share  results  with  large  group  &  discuss 
—  Integrate  results  across  groups 


12:00  Lunch 

Session  4:  Organizing  the  Knowledge 

1:15  Reconvene  in  Large  Group:  Review  Agenda,  expectations,  and  third  breakout 
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Break  into  subgroups:  IC  System  Models,  Attacker  Models,  Event  Characteriza¬ 
tion,  Vulnerabilities  and  Exploits 

1:30  Organize  the  knowledge  identified  in  the  AM  session  in  a  manner  usable  by 
researchers 

— Agree  on  principles  for  organization 
—  Organize  the  knowledge 
—  Identify  and  prioritize  gaps 
—  Fill  in  gaps  where  feasible 
3:30  Break 

4:00  Large  group  report  out  (15  min  per  group) 

—  Share  results  with  large  group  &  discuss 
—  Integrate  results  across  groups 
5:00  Adjourn  for  day 


4  March  2004 

The  objective  for  this  day  is  to  define  a  set  of  research  problems  that  are  of  interest  to  practi¬ 
tioners  and  to  identify  useful  intermediate  results  that  we  can  use  to  measure  progress. 

7:30  Continental  Breakfast/Coffee 
Session  5:  Defining  Measures  of  Success 

8:00  Review  results  so  far  and  set  agenda  for  Day  3 
Break  into  subgroups 

8:30  Identify  research  problems  whose  solution  will  benefit  the  security  practitioners 
—  Define  the  problems 

•  Brainstorm,  then  pick  the  top  two 

*  Use  the  knowledge  base  developed  yesterday 

—  Explain  the  benefits  to  the  practitioners  of  solving  the  problems 
—  Identify  useful  partial  results  and  their  benefit  to  the  practitioners 
—  Identify  remaining  knowledge  gaps  and  suggest  ways  to  fill  them 
—  Summarize  key  points  in  Power  Point  file 
10:30  Break 

1 1:00  Large  group  report  out  (15  min  per  group) 

—  Share  results  with  large  group  &  discuss 
—  Integrate  results  across  groups 

12:00  Lunch 

1:00  Wrap-up 

—  Summary  of  results 
—  Final  opportunity  to  comment 
—  Feedback  from  ARDA 
—  Burn  CD’s  and  distribute  to  the  participants 
2:30  Adjourn 


APPENDIX  C 

Links  to  Read-Ahead  Materials 


All  workshop  participants  were  provided  links  to  the  following  materials,  which  were  deemed 
relevant  as  background  and  “read-ahead”  material. 


Reports 

•  DOD  Insider  Threat  Mitigation:  Final  Report  of  the  Insider  Threat  Process  Team 
(http://www.c3i.osd.mil/org/sio/iptreport4_26dbl.doc) 

•  RAND  Insider  Threat  Report  (http://www.rand.org/publications/CF/CFl63/) 

•  SecurityFocus  Newsletter  132  (http://www.securityfocus.com/infocus/1546/) 

•  References  in  Characterizing  Threat  Paper 

•  White  Paper:  Cyber-Security  and  the  Insider  Threat  to  Classified  Information, 
November  1—2,  2000,  Computer  Science  and  Telecommunications  Board  (CSTB) 
(http://www7.nationalacademies.org/CSTB/whitepaper_insiderthreat.html) 

•  Inside  the  Mind  of  the  Insider  (http://www.securitymanagement.com/library/ 
000762.html) 

•  Espionage  Against  the  United  States  by  American  Citizens  1947-2001  (Defense  Per¬ 
sonnel  Security  Research  Center)  (http://www.ncix.gov/news/2002/oct/Espionage. 
pdf) 

•  The  Insider  Threat  to  U.S.  Government  Information  Systems  (NSTISSAM 
INFOSEC)  (http://www.nstissc.gov/Assets/pdf/NSTISSAM_INFOSECl-99.pdf) 

•  R.  Anderson,  R.  Brackney,  T.  Bozek,  Advanced  Network  Defense  Research:  Pro¬ 
ceedings  of  a  Workshop  (CF-159-NSA)  (http://www.rand.org/publications/CF/ 
CF159/CF  159.pdf). 

Collections  of  Cases  and  Reports 

•  Recent  Espionage  Cases  1975-1999  (Defense  Security  Service)  (http://www.dss.mil/ 
training/  espionage/) 

•  DSS  Employee  Security  Training 

—  Spy  Stories  (http://www.dss.mil/training/csg/security/Spystory/Intro.htm) 

—  Treason  101  (http://www.dss.mil/training/csg/security/Treason/Intro.htm) 

•  Federation  of  American  Scientists  Counter-Intelligence  Operations  (http://www.fas. 
org/irp/ops/ci/) 
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•  Cl  Centre  Counterintelligence  Reference  Materials  (http://www.cicentre.com/ 
LINKS_Reference_Material.htm) 

•  Office  of  the  National  Counterintelligence  Executive  (NCIX)  (http://www.ncix.gov/ 
news/index.html). 


Cases 

•  Robert  Hanssen  (FBI) 

— A  Review  of  the  FBI’s  Performance  in  Deterring,  Detecting,  and  Investigating  the 
Espionage  Activities  of  Robert  Hanssen  (Aug  14,  2003)  (http://www.usdoj.gov/ 
oig/special/03-08/index.htm) 

—  Cl  Centre  Article  (http://www.cicentre.com/Documents/DOC_Hanssen__l.htm) 

Webster  Commission  Report  (http://www.fas.org/irp/agency/doj/fbi/websterreport. 
html) 

—  Attorney  General  Webster  Commission  Report  Commentary  (same  as  above) 

—  Hanssen  Computerworld  article  (http://www.computerworld.com/securitytopics/ 
security/story/0,10801,57889,00.html) 

—  Webster  Commission  Cl  Center  article  (http://www.cicentre.com/Documents/ 
DOC__Quotes_Webster_Report.htm) 

•  Moonlight  Maze 

•  Ana  Montes  (DIA) 

—  Cl  Centre  Article  (http://www.cicentre.com/Documents/DOC_Montes_l.htm) 

•  Harold  James  Nicholson  (CIA) 

—  Affidavit  (http://www.fas.org/irp/offdocs/nicholson.htm) 

•  Brian  Reagan  (NRO) 

—  Cl  Centre  Article  (http://www.cicentre.com/Documents/DOC_Regan_l.htm) 

—  Washington  Post  article  on  discovery  of  documents. 


Conferences 

•  Mitigating  the  Insider  Threat:  Proceedings  of  RAND  August  2000  Workshop 
(http://www.rand.org/publications/ CF/ CF 1 63/) 

•  IATF  Forum  (http://www.iatf.net/) 

•  Practical  Counterintelligence  Conference 

•  PACOM  IA  Conference  (http://www.iaevents.com/) 

•  1999  SRI  Report  (http://www.csl.sri.com/users/neumann/insider-misuse/ins.pdf). 
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APPENDIX  E 

Presentation:  The  Robert  Hanssen  Case:  An  Example  of  the 
Insider  Threat  to  Sensitive  U.S.  Information  Systems 


The  Robert  Hanssen  Case: 

An  Example  of  the  Insider  Threat  to 
Sensitive  U.S.  Information  Systems 

Information  in  this  presentation  is  excerpted  from  the  main  report 
of  the  Commission  for  Review  of  FBI  Security  Programs 
(“Webster  Commission”),  March  2002,  and  unclassified, 
unrestricted  portions  of  its  Appendices  A,B 


Unclassified 
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Outline  of  presentation 


•  Background 

•  What  did  he  do? 

•  How  did  he  do  it?  What  might  be  done? 

•  How  could  he?  -  problems  in  FBI  infosec 

•  Relevant  FBI  systems  and  info  architectures 

•  Commission  recommendations  on  solution 
strategies 

•  Some  possible  conclusions  for  this  workshop 
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Background 


•  Robert  Hanssen,  at  the  time  of  his  arrest,  was  an  FBI  Supervisory  Special  Agent 

•  His  treason  is  called  “possibly  the  worst  intelligence  disaster  in  US  history” 

•  Over  22  years,  he  gave  the  Soviet  Union  and  Russia  vast  quantities  of  documents  and 
computer  diskettes  filled  with  national  security  information 

•  This  insider  treason  is  part  of  a  recurring  pattern: 

-  “Since  the  1 930s,  every  US  agency  involved  with  national  security  has  been  penetrated  by 
foreign  agents,  with  the  exception  of  the  US  Coast  Guard” 

-  117  American  citizens  have  been  prosecuted  for  espionage  between  1945-1990  (or  there  is  clear 
evidence  of  their  guilt).  Money  appears  to  be  the  main  factor;  most  spies  volunteered  their 
services.  Prominent  examples: 

*  Aldrich  Ames,  CtA  counterintelligence  officer  (9  years  as  spy) 

•  Ronald  Pclton,  former  intelligence  analyst  for  NS  A 

•  Jonathan  Pollard,  military  intelligence  analyst,  gave  Israel  800  classified  documents. 

1000  cables 

*  John  Walker,  retired  naval  officer,  with  son  and  brother,  supplied  the  Soviets  with  cryptographic 
material 
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What  did  he  do? 


•  Downloaded  large  quantities  of  information  from  the  main  FBI  Automated 
Case  Support  system 

•  Searched  the  Bureau’s  systems  to  see  whether  the  FBI  had  identified  his 
locations  as  drop  sites 

•  Searched  for  his  name  in  the  system  to  see  if  he  was  the  subject  of  an 
investigation 

•  Installed  unauthorized  software  on  his  office  computer 

•  Hacked  into  the  computer  of  a  Bureau  colleague 

-  “...purportedly  to  demonstrate  security  weaknesses  in  the  computer  system.” 

Although  discovered,  he  was  not  punished  for  this. 

•  Photocopied  documents  at  the  Bureau,  and  walked  out  with  them 

•  Walked  into  classified  meetings  uninvited  (“habitually”) 

•  Visited  former  colleagues,  discussed  classified  information 

•  Borrowed  a  TS/SCI  document,  photocopied  it  in  the  back  seat  of  his  car,  then 
returned  it 


*  “Non-technical  methods” 


How  did  he  do  it?  What  might  be  done? 


With  one  exception*,  all  his  activities  involved: 

-  technical  access  for  which  he  was  authorized,  or 

-  “non-technical  methods” 

What  might  be  done  (today)? 

-  USB  “thumb  drives”  —  a  gigabyte  on  a  keychain 

•  Will  all  USB  ports  be  disabled  on  desktop  and  laptop  computers?  Could  they 
be? 

-  CD,  DVD  —  could  all  CD,  DVD  write  access  be  disabled? 

-  Wireless  transmission 

•  A  wireless  card  slipped  into  a  laptop  or  desktop  PC,  transmitting  to  an 
external  “base”  computer  (e.g.,  in  a  parked  car) 

•  Bluetooth  local  transmission  to  a  cellphone,  PDA,  etc. 

•  Infrared  transmission  to  a  local  device  (e.g.,  PDA) 

•  Digital  camera  built  into  a  cellphone  or  PDA 

•  Audio  recording  within  a  PDA  or  cellphone 
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How  could  he? 

Systemic  problems  in  FBI  InfoSec 


•  Bureau  failed  to  develop  an  effective  strategy  to  identify  and  protect  critical 
information 

•  Classified  information  was  moved  into  systems  not  properly  accredited  for  it 

•  Until  recently,  FBI  didn’t  certify  and  accredit  most  of  its  computer  systems,  including 
those  handling  classified  information 

•  Inadequate  physical  protections 

•  Lack  of  adequate  documented  INFOSEC  policies 

•  FBI  failed  to  ascertain  security  requirements  of  ‘"owners”  of  information, 
and  identify  threats  and  vulnerabilities  that  must  be  countered 

•  Users  lack  sufficient  guidance  about  critical  security  features 

•  FBI  failed  to  limit  user  access  to  systems  and  databases 

•  Many  key  InfoSec  positions  remained  unfilled;  when  filled,  staff  have  inadequate  time, 
support,  and  authority 

•  Some  FBI  systems  have  insufficient  resources  to  perform  required  audits;  when  audits 
are  performed,  audit  logs  are  reviewed  sporadically,  if  at  all 
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FBI  systems  architecture  (excerpts) 


•  The  FBI  operates  between  30-60  classified  systems  :A?).  In  many  cases,  the 
boundaries  and  missions  of  these  systems  were  difficult  to  ascertain  i  A6) 

•  Automated  Case  System  (ACS)  is  the  main  investigative  system  of  records 

-  This  was  exploited  “almost  exclusively”  by  Hanssen  during  his  last  period  of 
spying 

-  Has  access  restrictions  by  office  (O)  or  by  list  of  persons  authorized  (P) 

•  But  HQ  personnel  could  access  all  O  cases  opened  by  any  field  office  (A29) 

-  Highly  sensitive  information  can  be  found  in  unrestricted  admin  files,  a  fact  that 
Hanssen  exploited  <  A3o-!>l » 

-  After  9/ 1 1 ,  HQ  mandated  that  no  ACS  case  may  be  restricted  or  deliberately  not 
uploaded  without  approval  of  an  Asst.  Director;  HQ  later  removed  additional  ACS 
case  restrictions  (A35'» 

-  No  formal  procedure  for  terminating  accounts  when  a  user’s  need  for  access  ended 


Note:  Symbols  such  as  (A 7}  provide  appendix  and  page  reference 
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FBI  systems  architecture  (excerpts,  cont.) 

•  Trilogy  is  a  3 -part  system  and  network  upgrade  underway 

-  But  it  is  being  rushed  to  completion,  with  security  features  playing  “catch  up” 

•  FBINet 

-  A  Secret-high  FBI  network,  but  TS/SCI  data  has  been  processed  on  it A42> 

•  Diskettes 

-  Many  unmarked,  others  marked  unclassified  --  were  used  on  Internet  terminals  without 
screening  information  on  the  diskettes  :A4-l  i 

•  HQ  Data  Center  Facility 

-  Contains  a  trove  of  classified  information;  has  backup  tapes 

-  In  a  SCIF,  staffed  24  hrs/day.  All  staff  polygraphed.  All  staff  cleared  for  all  SCI 
compartments 

-  “If  one  of  these  persons  were  to  smuggle  these  tapes  out  of  the  FBI  ...  the  entire  content 
of  the  FBI’s  Investigative  Mainframe,  including  ACS  and  the  ASSET  database,  could  be 
restored  outside  the  FBI  and  the  tapes  returned  before  anyone  noticed  them  missing.” 

(A4«i 
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FBI  systems  architecture  (excerpts,  cont.) 


•  Audit  logs  and  trails 

-  Were  used  extensively  to  trace  Hanssen’ s  activity,  after 
the  fact 

-  Investigators  able  to  determine  which  ACS  files  Hanssen 
viewed,  how  he  searched  for  them,  how  long  he  viewed 
them,  and  whether  he  downloaded  them  <a5k> 

•  But  not  whether  he  did  screen  dumps  while  viewing  them 

-  FBI  audit  trails  were  never  reviewed  in  real  time 

-  Other  than  occasional  and  ill-defined  reviews,  no  one  at 
FBI  reviewed  and  analyzed  audit  data  proactively 
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Commission  recommendations  re. 
solution  strategies 

•  Countermeasure  objectives  should  be: 

-  Reduce  the  time  between  defection  and  detection 

-  Reduce  the  number  of  defectors 

-  Reduce  the  number  of  information  compromises  (attacks)  by  a  defector, 
and 

-  Reduce  the  amount  of  damage  caused  by  each  compromise  (or  attack) 

•  Distinguish  three  types  of  “moles:”  nis-m » 

-  Standard  (has  regular  user  privileges) 

-  Privileged  (more  access  and  privileges  (either  mission-privileged,  or  over¬ 
privileged) 

-  Cracker  (bends  or  violates  assigned  permissions  and  privileges) 


10 


Commission  recommendations  re. 
solution  strategies  (cont.) 

•  Consider  requirements  of  18  baseline  INFOSEC  categories  (from  NSA’s 

INFO  SEC  Assessment  Methodology  v<r? 

INFOSEC  documentation 

Contingency  planning 

INFOSEC  roles  and  responsibilities 

Maintenance 

Identification  and  authorization 

Configuration  management 

Account  management 

Labeling 

Session  controls 

Media  controls 

Networking/connectivity 

Physical  environment 

Telecommunications 

Personnel  security 

Auditing 

Education,  training,  awareness 

Malicious  code  protection 

System  assurance 

ll 
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Some  possible  conclusions 
for  this  workshop 

•  It  may  be  shortsighted  to  consider  just  one  “test”  system,  such  as  Intelink 

-  when  the  real  world  comprises  a  congeries  of  systems  linked  in  various  ways,  with 
different  authorities,  owners,  procedures,  criteria  ~  even  in  one  agency 

•  Many  violations  used  “nontechnical  means” 

-  Can  technical  means  aid  in  discovering  these  violations? 

•  If  users  are  unaware/untrained/unappreciative  of  security  markings,  restrictions, 
controls  —  data  could  be  available  to  “normal”  violators,  not  needing  extraordinary 
means 

•  It  is  ever  more  trivial  to  record  large  amounts  of  data  on  tiny  devices,  for  physical 
removal  —  often  with  no  trace  or  audit  log  record 

•  Audit  logs  are  only  as  good  as  the  uses  to  which  they’re  put 

-  Too  many  “false  positives”,  too  many  resources  required  to  operate  and  review,  too 
many  differing  systems’  logs  not  combined,  too  much  data  —  all  weaken  their 
effectiveness 
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APPENDIX  F 

Presentation:  Overview  of  the  Results  of  a  Recent  ARDA 
Workshop  on  Cyber  Indications  and  Warning1 


Insider  Threat  Workshop 
Overview 


Dr.  Mark  Maybury 

m  MITRE 


MITRE 


2  March  2004 


*  This  presentation  is  copyrighted  by  MITRE  Corporation.  Reproduced  by  permission. 
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Overview  of  the  Results  of  a  Recent  ARDA  Workshop  on  Cyber  Indications  and  Warning  61 


Webster  Report  Findings 

•  Goals 

-  Reduce  time  between  defection  and  detection 

-  Reduce  the  number  of  defectors 

-  Reduce  the  number  of  information  compromises 
(attacks)  by  defector 

-  Reduce  the  amount  of  damage  cause  by  each  compromise  (attack) 

•  More  sophisticated  indications  and  warning 

-  classification  [in  contrast  to  profiling]  has  been  used  less 
often  in  the  intrusion  detection  environment.  This  is  because  it  is 
crucial  for  classification  analysis  that  there  be  adequate 
collections  of  data  representing  both  attacks  and  non-attacks. 
Because  this  type  of  analysis  is  new  to  the  intrusion  detection 
world,  rarely  is  this  information  collected  in  the  proper  form.” 

•  Honeynets 

-  “Honeypots  ...  could  be  used  to  catch  moles  that  fit  within  the 
standard  or  privileged  model  and  operate  “below  the  radar” 

Pago  4  — |  | 
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Outline 

•  Motivation 

•  Insider  Threat  Digital  Library 

•  Insider  Scenarios:  PAL,  TIDES  Admin,  Jack 

•  Common  Data  Repository  (CDR) 

•  Proof  of  Concept  Approaches  for  Detection 

•  Evaluation 

•  Findings  and  Recommendations 
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“A  recent  DoDIG  report  indicates  that,  for  [over  1,000]  investigations,  87 
percent  of  identified  intruders  into  DoD  information  systems  were  either 
employees  or  others  internal  to  the  organization” 

Cyber  Indicators  Exist:  Nearly  half  of  suspicious  contact  reports  made  to  the  Defense  Security 
Service  by  defense  contractors  begin  with  an  email  request  for  information,  especially  by 
foreign  organizations,  per  Gene  Smith,  a  DSS  counterintelligence  analyst 


•  Recommendation  No.  14:  Detecting  Improper  Computer  Usage 
and  Enforcing  "Need  to  Know" 

•  The  FBI  should  implement  measures  to  improve  computer 
security,  including 

-  (a)  an  audit  program  to  detect  and  give  notice  of 
unauthorized  access  to  sensitive  cases  on  a  real-time 
basis; 

-  (b)  an  audit  program  designed  to  detect  whether  employees 
or  contractors  are  using  the  FBI's  computer  systems  to 
determine  whether  they  are  under  investigation; 

-  (c)  procedures  designed  to  enforce  the  "need  to  know" 
principle  in  the  context  of  computer  usage;  and 

-  (d)  a  program  designed  to  ensure  that  restricted  information 
cannot  be  improperly  accessed  through  the  use  of  security 
overrides  or  other  means. 

. IT  Ojfc 
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Workshop  Goal 


Design  and  develop 
a  proof  of  concept  system 
for  early  indication  and  warning 
of  malicious  insiders 


Pago  8 
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Definitions 

•  Insider  -  Anyone  with  access,  privilege,  or  knowledge  of 
information  systems  and  services 

•  Malicious  insider  (Ml)  -  motivated  to  intentionally 
adversely  impact  an  organization’s  mission 
(e.g.,  deny,  damage,  degrade,  destroy). 

•  Observable  -  Anything  that  can  be  detected  with  current  tech. 

•  Sensor  -  Measures  an  observable  (e.g.,  login,  print,  delete) 

-  Sensor  logs  -  recording  of  observables 

-  Sensor  stream  -  series  of  observables  from  one  sensor 

•  Indicator  -  Identifiable  event  based  on  sensor  output  logs 

•  Detect  -  Determines  event  based  on  processing  indicators 

•  Report  -  Indications  and  warnings  of  malicious  insider  behavior 

•  Incident  -  Related  set  of  events 

•  Fusion  -  Processing  multiple  sensor  outputs  to  provide  an 
enhanced  result  (e.g.,  more  abstract/concrete,  higher  confidence) 
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Workshop  Participants 

•  Mr.  Dick  Brackney,  ARDA  •  Dr.  Tom  Longstaff,  SEI  of  CMU 

•  Dr.  Steve  Chapin,  Syracuse  •  Mrs.  Paula  MacDonald,  MITRE 

•  Dr.  Brant  Cheikes,  MITRE  •  Dr.  Jack  Marin,  BBN 

•  Dr.  John  Copeland,  Georgia  Tech  •  Mrs.  Sara  Matzner,  ARL:UT 

•  Mr.  Mick  Costa,  MITRE  •  Dr.  Mark  Maybury,  MITRE 

•  Dr.  Sandra  Dykes,  SwRI  •  Mrs.  Bev  Nunan,  MITRE 

•  Mr.  Thomas  Eisenhut,  SwRI  •  Mr.  Jeff  Sebring,  MITRE 

•  Ms.  Penny  Lehtola,  ARDA  •  Mr.  Conor  Sibley,  BBN 

•  Mr.  Jed  Haile,  Logan  Group  •  Mr.  Don  Slife,  consultant 

•  Mr.  Tom  Hetherington,  ARL:UT  •  Mr.  Lance  Spitzner,  Honeynet 

•  Dr.  Wenke  Lee,  Georgia  Tech  •  Mr.  Jeffrey  Picciotto,  MITRE 

•  Mr.  Scott  Lewandowski,  MIT  LL  •  Mr.  Richard  Pietravafle,  MITRE 

•  Mr.  Brad  Wood,  BBN 

And  other  graduate  students  and  technical  staff  such  as  Christian  Sarmoria  and  Cheol-min 
Hwang  at  Syracuse,  undergraduate  students  George  Chamales  and 
Page  12  Ryan  Smith  at  ARL.UT ;  Bob  Gaimari,  Billy  Garrison,  and  Laurie  Damianos  at  MITRE 


Definitions 

•  Insider  -  Anyone  with  access,  privilege,  or  knowledge  of 
information  systems  and  services 

•  Malicious  insider  (Mi)  -  motivated  to  intentionally 
adversely  impact  an  organization’s  mission 
(e.g.,  deny,  damage,  degrade,  destroy). 

•  Observable  -  Anything  that  can  be  detected  with  current  tech. 

•  Sensor  -  Measures  an  observable  (e.g.,  login,  print,  delete) 

-  Sensor  logs  -  recording  of  observables 

-  Sensor  stream  -  series  of  observables  from  one  sensor 

•  Indicator  -  Identifiable  event  based  on  sensor  output  logs 

•  Detect  -  Determines  event  based  on  processing  indicators 

•  Report  -  Indications  and  warnings  of  malicious  insider  behavior 

•  incident  -  Related  set  of  events 

•  Fusion  -  Processing  multiple  sensor  outputs  to  provide  a 
enhanced  result  (e.g.,  more  abstract/concrete,  higher  confidence) 
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Hypotheses  §H| 

•  While  some  Mis  can  be  detected  using  a  single  cyber  ^ 

observable,  other  Mis  can  be  detected  using  a  heterogeneous 
approach  to  indications  and  warning 

-  Test  using  case  analysis  and  experimentation 

-  Increase  the  number  of  insiders  detected 

•  Fusing  information  from  heterogeneous  information  sources 
will  allow  us  to  formulate  more  accurate  and  timely  indications 
and  warning  of  insiders 

-  Type  of  sensor 

(e.g.,  card  reader,  authentication,  printer,  telephone  calls) 

-  Level  of  IP  stack  (e.g.,  from  network  to  application) 

•  Observables  together  with  domain  knowledge  can  help  detect 
inappropriate  behavior  (e.g.,  need  to  know  violations) 

-  Domain  model  (e.g.,  user  role,  asset  value  to  mission) 
helps  distinguish  need  to  know  violations 

-  Domain  models  need  to  be  dynamic  to  distinguish  the 
changing  environment  (user  roles,  asset  value) 


Group  Hypotheses  ^yp 

•  Structured  Analysis 

-  Real-time  analysis  of  log  data  can  classify  some  Ml  behavior 
as  it  occurs. 

•  Fusion 

-  Accumulated  cyber  observables  can  be  used  to  generate 
early  indications  of  a  Ml. 

•  Honeynets 

-  Instrumented  targets  can  contribute  to  early  Ml  identification. 


Overview  of  the  Results  of  a  Recent  ARDA  Workshop  on  Cyber  Indications  and  Warning  67 


68  Understanding  the  Insider  Threat:  Proceedings  of  a  March  2004  Workshop 
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User  Taxonomy 
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Malicious  Insider  Data  Set 

•  Real  network  -  MITRE’S  DMZ 

-  A  separate  network  for 
experimentation  and  sponsor 
community  support 
established  outside  of  the 
MITRE  internal  network 

-  300  -  400  hosts 

-  Various  services:  Web,  news, 
email,  database,  ... 

-  Data  sources  on  network  for 
use  in  scenarios 

-  Deploy  additional  sensors 

•  Three  of  75  users  active  during  Efj 
period  acted  as  malicious  iSQ 
insiders  based  on  historical  and 
project  scenarios  of  insider 


behavior 


Internet 


MITRE 
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Network 


MITRE 

DMZ 


ITG 
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Evaluation  Activity 

Insider  Activity  with  Insider  Activity  with 

Journal  Available  Journal  Revealed  After 


Pal 


Normal  Activity 


Tides 

Admin 


Jack 
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Data 

•  Distributed  sensor  log  collection  system 

-  18  DMZ  hosts  (of  400),  17  sources,  11+M  records 

•  Users 

-  4059  distinct  users  in  logs  (7658  incl.  badge  records) 

-  75  active  DMZ  users,  3  of  which  simulate  malicious  insider 
behavior 

•  Sensor  data  anonymization  system 

-  ~6k  SLOC  (Java,  Perl,  shell,  SQL) 

•  Integration  Framework  software 

-  Publish/Subscribe  data  dissemination  system 

-  Reasoning  management  system 

-  Java  Expert  System  Shell 

-  ~4K  SLOC  (Java) 

-  Common  Data  Repository  (CDR) 
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Multiple  cyber  observables  can  be  fused  in  order  to 
provide  earlier  indications  of  a  malicious  insider 

indications  from  multiple  sources  further  support 
and  strengthen  the  conclusion 
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Execution  Sample 


Indicator  Cyber-Access,  user324,  weight  1,  at  2003-12-10  1 1:14:38,  from 
tides2000.mitre.org 

su  to  user9676  failed  for  non-admin  user  user324  on  /dev/pts/0 

Indicator  Physical-Access,  user295,  weight  5,  at  2003-12-15  19:19:37, 

After  hours  badge  access  for  user295 

Indicator  Cyber-Extraction-Exfiltration,  user2649,  weight  5,  at  2004-01-06 
15:37:28,  from  nrrc-springfield.mitre.org, 

Data  was  uploaded  to  an  external  server  via  FTP  protocol 

Indicator  Cyber-Reconnaissance,  user295,  weight  10,  at  2004-01-09  20:57:18, 
from  nrrc-springfield.mitre.org, 

User  user295  searching  in  non-need-to-know  country  korea 


MGk 
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Execution  Sample  (continued) 


Indicator  Cyber-Communication,  user9,  weight  15,  at  2004-02-10  22:14:48, 
from  cvw.mitre.org, 

User  user9  received  email  with  masqueraded  content  from 
user!  1649@yahoo.com 

Indicator  Cyber-Reconnaissance,  userl,  weight  5,  at  2004-02-10  13:54:15,  from 
nrrc-plymouth.mitre.org, 

Ongoing  Cl  violation  --  066.170.227.074  currently  has  49613  alerts  of  this 
type 

Indicator  Cyber-Extraction -Exfiltration,  user295,  weight  8,  at  2004-02-12 
23:54:58,  from  dmzsrv1.mitre.org, 

User  user295  sent  encrypted  email  to  user9983@comcast.net 

Indicator  Cyber-Extraction-Exfiltration,  userl,  weight  15,  at  2004-02-20 
12:25:03,  from  nrrc-erie.mitre.org, 

User  userl  sent  email  with  masqueraded  content  to  userl ©mitre.Qm^ 
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Experimental  Results 
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. f)  DATA  REDUCTION 

i  •  7.4  M  records  examined  for  75  users 
~"T~  •  259  indicators  for  24  users 
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Experimental  Results 
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•  Developed  evaluation  methods/metrics 

•  Timeliness,  e.g.,  time  from  defection  to  detection 

-  years,  months,  weeks,  minutes 

•  Accuracy 

-  Precision  =  #  correctly  detected  insiders  /  #  reported 

-  Recall  =  #  reported  insiders  /  total  #  actual  insiders 

-  False  positives  =  1 -precision 

-  False  negatives  =  total  #  actual  insiders  -  #  correctly  detected 
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Caveat  on  Experiment  results 

•  Limited  users  (75) 

•  Limited  #  of  algorithms  attempted 

•  Insider  models  motivated  by  actual  insider  behavior  but 
require  community  vetting  (e.g.,  Jack  would  already  know 
network  topology  so  would  not  scan) 

•  Differences  from  1C  networks  (e.g.,  no  guards,  open  network) 

•  Relatively  few  hosts  (18/400)  instrumented 

•  Data  set  collected  over  several  months,  insiders  known  to 
operate  over  years 
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Example  Calculation  of  Fusion  Performance 

Actual 


Identified 

On 

Watchlist 


False  Positives:  2  False  Positive  Rate  =  1/72  =  0. 03 
False  Negatives:  1  False  Negative  Rate  —1/3  =  0.33 
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Performance:  Accuracy 
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*  Stealth  Watch  error  rates  computed  on  host  IPs  in  network;  SAG  and  Data  Fusion  computed  on  users. 

Across  approaches,  correctly  identified  3  out  of  3  insiders 
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Performance:  Timeliness 
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Findings 

•  Ml  effectively  represented  by  historical  case  and  real  experts 
(analyst  (Penny),  app  admin  (Laurie),  system  admin  (Billy)) 

•  Multisource,  multilevel  logging  supports  Ml  behavioral  trace 

-  Careful  sensor  deployment  important  (e.g.,  network 
monitoring  and  application  monitoring) 

-  Not  all  logs  provide  user  or  host  attribution 

•  Cumulative  non-temporal  (data  fusion)  and  temporally 
dependent  structural  analysis  model  both  worked  well 

-  Both  approaches  insensitive  to  weights 

-  Require  expertise  on  alert  sources  and  systems 

•  Automated  broad  classification  of  users  feasible 
(e.g.,  end  user  versus  system  administrator) 
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Limitations 

•  Unclassified,  non  tC  network 

-  Open  network,  no  guards,  no  machine  lock  down 

-  1C  has  swipe  in/out,  common  networking  timing 

•  Relatively  few  hosts  (18/400)  instrumented 

•  Data  set  collected  over  several  months,  insiders  known  to 
operate  over  years 

•  Representative  insider  threats  addressed  ...  many  additional 
models  required  (e.g.,  slow  Jack  attack,  Ana  Belen  Montes) 

•  Common  applications  addressed  (e.g.,  mail,  news,  web),  but 
additional  services  would  have  enabled  richer  scenarios  (e.g., 
instant  messaging,  database  access)  and  analyses  (e.g., 
social  network  analysis). 
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Lessons  Learned 


•  Methodology 

-  Do  scenarios  up  front. 

•  Modify  classical  sensors 

•  Clear  where  to  place  sensors  for  outsider,  not  obvious  where 
you  place  sensors  for  insider 

•  Important  not  only  to  detect  Ml  but  also  provide  evidence  for 
investigation 

•  Underestimated  complexity  of  sensor  selection  and  log 
analysis  leaving  insufficient  time  for  experimentation 

•  Workshop  format  efficient  and  enjoyable:  few  but  key 
physical  meetings,  work  in  between  meetings,  weekly  telecon, 
common  shared  data 


— ITCMr 
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Future  Insider  Knowledge  and  Focus 


Detectable 


Hard  to  Detect 


Occurred 


Not  yet  Occurred 


U  who  attacks \ 
the  network  ) 


Not  yet 
Detectable 


Ana  Helen  Montes\  N on-cyber  component 
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Access  to  Insider  Threat  Data  Set  | 

•  Corpus  created  including  policy,  scenario  creation, 
instrumenting/capturing,  archiving,  anonymization,  and  database 
indexing  network,  application,  and  physical  access  logs, 

•  11+M  records 

•  Based  on  Infosec/legal/HR  review,  availability  beyond  workshop 
requires  addressing: 

-  Security  Vulnerabilities 

•  IP  and  possibly  machine  anonymization 

•  We  believe  no  vulnerability  data  (e.g.,  no  nessus  scans) 

•  Recovery  of  network  topology 

-  Privacy 

-  User  IDs  from  URLs 

-  Machine  names  from  URLs 

-  Approval  from  host  owners  and  users  for  new  use 
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SpyCatcher  Motivation 


•  Community-wide,  corpus  and  metrics-based  evaluation  has 
resulted  in  rapid  advances  in  areas  including 

-  Speech,  e.g.,  ATIS 

-  Machine  translation 

-  Information  retrieval,  e.g.,  TREC 

•  Information  Extraction,  e.g.,  MUC 

•  With  the  development  of  MITRE’s  DMZ  corpus,  we  have  an 
opportunity  to  inject  simulated  (historical  or  projected) 
insider  behavior  and  test 

•  Systems  can  then  be  run  against  this  data,  results  compared, 
and  error  analysis  performed. 


h-tcx 
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SpyCatcher  Challenge  Process 


Select 

Insider 

Case 


Simulate  on 
DMZ(orOSIS) 


Participants 
Sign  Data  NDA 


Data  Sent  to  rh-1  i 

Participants  n  .  mr 

1  Participants  ^ 

Submit  Runs  si 
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Detailed  Plan  iS3P 

•  Each  year  select  an  increasingly  challenging  insider  threat  (e.g.,  start 
with  FBI’s  Robert  Hanssen  then  move  on  to  DIA’s  Ana  Belen  Montes) 

•  Annually  publish  the  kinds  of  logs  and  indicators  that  will  be  present 
in  the  data  set  (e.g.,  no  financial  records  but  system,  network,  and 
application  logs).  Do  this  during  summer  workshops. 

•  Simulate  newly  modeled  threat  in  context  of  real  network  (e.g.,  DMZ) 

•  Groups  sign  non-disclosure  of  data  agreement 

•  Research  groups  are  given  logs  and  asked  to  produce  indicators  and 
warnings  (e.g.,  list  of  suspected  accounts,  likely  targets)  in  short  time 
period  (e.g.,  days). 

•  Separate  categories  for  manual/semi-automated  and  automated 
approaches  (help  discover  new  indications  and  warning  methods). 

•  Award  cash  prizes  for  the  best  (fastest  and  most  accurate)  and  most 
innovative  systems  (e.g.,  $5  or  $10K,  plus  bragging  rights)  as  judged 
by  panel  of  experts 

•  Share  results  at  open  workshop  so  community  learns 
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Summary 


Design  and  develop 
a  proof  of  concept  system 
for  early  indication  and  warning 
of  malicious  insiders  (Mis) 


i»-  MITRE 
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Approach 


Novel  Ideas 


Novel  functional  model  related  to  attack  graphs  which 
will  map  pre-attack  indicators  to  potential  attacks 
(Brad  Wood/Jack  Marin,  BBN;  Steve  Chapin,  Syracuse) 

Automatic  fusion  of  traditional  and  novel  indicators 
(Sara  Matzner,  U  Texas;  Sandy  Dykes,  SwRI) 

Distributed  honeynets  to  acquire  attacker  properties, 
pre-attack  intensions,  and  potential  attack  strategies 
(Lance  Spitzner/Jed  Haile,  Honeynet) 

Multilevel  Monitoring  -  packet  level  (John  Copeland, 
Georgia  Tech),  system  level,  and  application  level. 

Knowledge  Based  Infrastructure  for  l&W  Integration 
(Brant  Cheikes,  Rich  Pietravalle,  MITRE) 
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Accomplishments 

Developed  taxonomies  of  Ml  observables  and  assets 

Developed  scenarios:  analyst  PAL  (historical  need  to 
know  violator)  and  sys  admin  Jack  (projected  attacker) 

Developed  Common  Data  Repository  (CDR):  11+  million 
records  of  cyber  events:  physical  access  (e.g.,  badge 
logs),  network  (e.g.,  StealthWatch,  Snort,  Honeynet), 
host/admin  access  (e.g.,  password,  su,  login),  and 
user/application  level  (e.g.,  web,  mail,  news). 

Real-time  detection  of  PAL  and  Jack  Mis,  exploiting  data 
fusion,  StealthWatch,  and  novel  sensors 

Lance  Spitzner.  Dec  200.1  “Honeypott:  Catching  the  tinkler  Threat”  ACSAC,  Lit  Vegas.  NV. 
Mayburv,  M.,  ft  al.  in  preparation.  Inidcr  Threat  Chalcngr  Workshop:  Final  Report.  MTR 
MBOOOOOH. 

Mavbnry.  M„  ft  al  In  prep.  Toward  a  Modrl/Delection  of  Insider  Behavior. 


Insider  Case  Matrix 


User  TatNfHiomy 
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Accomplishments  (jf 

•  Digital  Library  for  Research  in  Insider  Threat 

-  Affidavits,  case  studies,  case  analysis,  observable/asset 
taxonomy 

-  Briefed  Insider  Modeling  at  ARDA’s  Advanced 
Countermeasures  for  Insider  Threat  (ACIT)  Kickoff 

•  Experimentation  Data  Set  and  data  integration  environment 

-  Developed  Common  Data  Repository  (CDR):  11+ million 
records  of  cyber  events:  physical  access  (e.g.,  badge 
logs),  network  (e.g.,  StealthWatch,  Snort,  Honeynet), 
host/admin  access  (e.g.,  password,  su,  login),  and 
user/application  level  (e.g.,  web,  mail,  news). 

-  Toolset  to  support  anonymization  and  filtering 

•  Developed  scenarios:  analyst  PAL  (historical  need  to  know 
violator  -  EO  12968)  and  sys  admin  Jack  (projected  attacker) 

•  Proof  of  Concept  Tech  Approaches  for  Detections 

•  Test  Cases  for  Evaluation 


Publications 


•  Lance  Spitzner.  Dec  2003  “Honeypots:  Catching  the  Insider 
Threat”  ACSAC,  Las  Vegas,  NV. 

•  Maybury,  M.,  Sebring,  J.,  Chase,  P.,  Chiekes,  B.,  Pietravalle, 
R.,  Costa,  M.,  Brackney,  D.,  Lehtola,  P.,  Matzner,  S., 
Hetherington  ,  T.,  Marin,  J.,  Wood,  B.,  Longstaff,  T.,  Spitzner, 
L.,  Haile,  J.  L,  Cunningham,  R.,  Copeland,  J.,  and 
Lewandowski,  S.  In  preparation.  Toward  a  Model  of  and 
Detection  of  Insider  Behavior. 

•  Maybury,  M.,  Chase,  P.,  Sebring,  J.,  Cheikes,  B.,  Pietravalle, 
R.,  Costa,  M.,  Matzner,  S.,  Hetherington,  T.,  Longstaff,  T., 
Wood,  B.,  Marin,  J.,  Spitzner,  L.,  Haile,  J.,  Lewandowski,  S., 
Cunningham,  R.,  and  Copeland,  J.  in  preparation.  Insider 
Threat  Challenge  Workshop:  Final  Report.  MITRE  Technical 
Report,  MTR. 


-IT 
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APPENDIX  G 


Presentation:  Intelink  Factoids 


Intelink  Factoids 

Pete  Jobusch,  CTO 
Information  Assurance  Directorate 
Intelink  Management  Office 

peteij@intelink.gov 
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Topics 

•  What  is  Intelink? 

•  History 

•  Statutory  and  Policy  Environment 

•  Information  Space  Issues 


What  is  Intelink? 

•  Intelink  is  NOT  a  Network 

•  Intelink  is  NOT  a  Service 

•  Intelink  is  a  Corroboration 


History 


•  Mosaic 

•  Intelink  Services  Management  Center 

•  Intelink  Management  Office 


Statutory  and  Policy 
Environment 

•  Title  50 

•  DCI  Directives 

•  Community  Policies 
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APPENDIX  H 

Presentation:  Glass  Box  Analysis  Project 


UNCLASSIFIED 


Glass  Box  Analysis  Project: 

Overview  for  Insider  Threat 
Workshop 

Dr.  Frank  L.  Greitzer 

Battelle,  Pacific  Northwest  Division  (PNWD) 


PNWD  Glass  Box  Project  Manager:  Paula  Cowley 
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Glass  Box  as  a  NIMD  Hub 
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Novel  Intelligence  from  Massive  Data 
(NIMD) 

■  ARDA  Research  Program 

■  ARDA  Thrust  Manager:  Lucy  Nowell 


lgPMfc>SS  *■  \ 


Glass  Box  Objectives: 

■  Meaningful  data  for  NIMD 
research  community 

■  Utility  software  for  examining 
data 

■  Facilitate  new  tool  development 
by  NIMD  researchers 
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Goal:  Capture  the  Analytic  Process 


Collect  data  from  signed-up  analysts 

•  Perform  analytic  taskings  provided  by  analysts 

•  Collect  “ground  truth”  data  of  what  analysts  actually  did 

•  Capture  cognitive  thought  processes 

•  Capture  queries  and  documents  read 

•  Capture  reports  generated 

•  Capture  “Over  The  Shoulder”  view  of  activities 
Distribute  Database,  Filestore,  and  Access  Tools 
to  NIMD  Researchers 

Enable  reconstruction  and  visualization  of  analytic  process. 
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Glass  Box  Instrumentation  Software  Automatically 
Captures  Workstation  Events 

■  Web  Browser  activity 

•  URLs  and  contents  of  all  pages  visited 

•  Images  displayed  on  Web  pages 

•  Queries  submitted  by  analyst  to  search  engine 

•  Results  of  query 

■  Application  records  (e.g.,  MS  Word,  PowerPoint,  Excel) 

•  Periodic  snapshots  of  documents 

•  Various  actions  such  as  “Find...” 

■  Window  events  (active  window,  location  on  screen,  how  long  open, ...) 

■  File/Save  events 

■  Copy/Paste  events 

■  Keyboard  and  mouse  data 

■  Etc. 


Analyst-initiated  annotations 
are  also  captured. 
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UNCLASSIFIED 

Glass  Box  Software  Also  Provides  Review 
Functions  for  NIMD  Researchers 


■  Tabular  Review  shows  time- 
stamped  activities 

■  "Replay”  Events  in  ‘Deja  View’ 
mode 

■  “Over-the-Shoulder”  View  mode 
(Camtasia) 


Data  enable  visualization 
of  analysis  process. 


ES 


B 
K 

sat 


I 


Jan 

eu 


PARC’s  Web  Behavior  Grapher 
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Glass  Box  Instrumentation  Can  Be  Used  To 
Support  Insider  Threat  R&D 

Possible  Applications 

■  Capture  sample  data  to  support  event  characterization  and 
vulnerabilities  R&D 

■  Create/capture  sample  data  using  insider  threat 
“simulations" 

■  Use  Glass  Box  API  to  integrate/test  proposed  sensors 


NIMD 


n  %  >  %i-;. 
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APPENDIX  I 


Presentation:  Interacting  with  Information:  Novel  Intelligence 
from  Massive  Data  (NIMD) 


UNCLASSIFIED 


Interacting  with 
Information: 

•  •••  j 

Novel  Intelligence 
from  Massive  Data| 

(NIMD)  £] 

*0  *  * 

Dr.  Lucy  Nowell 

NIMD  Program  Manager  $.£' 

arda  | 

Advanced  Research  and 
Development  Activity 


"Wl. 

1-WC 

.  trie* 

iio- 

A&: 

jftor :  •" 


i&Vt  thrift.  ■ 
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Today 


•  NIMD  and  Insider  Threat 

•  Motivation  for  NIMD 

•  NIMD  Research  Agenda 

•  NIMD  Events 

•  Communicating  with  NIMD 


2 


UNCLASSIFIED 


3/9/04 
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NIMD  &  Insider  Threat 


*  Glass  Box  team  and  I  are  here  to  support  you 

*  NIMD  Program  will  facilitate  Insider  Threat  program  access 
to  Glass  Box  software  and  data 

-  Input  on  new  capabilities  to  add  to  GB? 

*  Relevant  NIMD  research  results  will  be  available 

-  Models  of  analytic  process 

-  Cognitive  task/work  analysis 

-  Etc. 
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Growing  NIMD  Knowledge  Base 


Suggestioi 
Hypotheses,  & 
Discoverit 


3.T?iP4 
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Keys  to  NIMD  -  Driving  Ideas 


•  Recognize  analyst's  assumptions  and  biases; 

-  Make  different  assumptions  and  evaluate  how  the  outcome  changes. 

-  Evaluate  implications  of  analytic  bias;  counter  as  needed. 

•  Recognize  analyst’s  strategy; 

-  Employ  different  strategies  and  evaluate  how  the  outcome  changes. 

•  Examine  prior  and  tacit  knowledge  embedded  in  reports  and  queries; 

-  Validate  and  capture  for  use  by  others; 

-  Ensure  that  organizational  prior  knowledge  is  reflected  in  the  analysis. 

•  Find  and  call  out  relevant  data  not  used  in  the  analysis; 

•  Provide  a  mixed-initiative  analytic  environment  that  supports 
simultaneous  tracking  of  multiple  hypotheses  and  exploration  of  massive 
data. 
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Data  Provided  to  Researchers 


•  Glass  Box  Data 

•  Center  for  Non-Proliferation  Studies  data 
on  biological  and  chemical  WMD  and 
terrorism,  adding  nuclear  WMD  this  year 


•  NIMD  does  not  provide  access  to 
Government  databases 
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NIMD  Knowledge  Base(s) 


•  Cognitive  models  of  analysts 

•  Models  of  analytic  strategies 

•  Captured  prior  and  tacit  knowledge 

-  Driven  by  analytic  taskings  and  analysts’  activities 

•  Ontologies  and  other  domain  mappings  of  data 

•  Other  types  to  be  determined  during  research 


•  NOT  aimed  at  capturing  data  about  individuals  other  than 
analysts 
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NIMD  Events 


PI  Meeting  -  May  25-28  in  Crystal  City 
Fall  PI  Meeting  in  Orlando,  dates  TBD 

ARDA  Knowledge  Representation 
Symposia  (sponsored  by  NIMD) 

-  Three  3-day  sessions 

-  August,  October,  January  --  dates  TBD 

-  Goal  is  to  facilitate  interoperability 
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Communicating  with  NIMD 

Dr.  Lucy  Nowell,  PM  -  ltnowel@nsa.gov, 
443-479-8010  or  301-688-7092  (ARDA  ofc) 

Dan  Doney,  NIMD  SETA  - 
qddonev@nsa.gov 

Thomas  Fortney,  NIMD  SETA  - 
tafortn@nsa.gov 

NIMD  web  site  maintained  by  NIST  - 
access  can  be  provided  on  request  from 
Dick  Brackney 
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Thank  You! 

Dr.  Lucy  Nowell 
NIMD  Program  Manager 
ltnowel@nsa.gov 
lucy.nowell@pnl.gov 

http://apdiQitalsupport.mm. ap.org 


3/9/04  UNCLASSIFIED  _ 12 


Bibliography 


Anderson,  Robert  H.,  Thomas  Bozek,  Tom  Longstaff,  Wayne  Meitzler,  Michael  Skroch,  and  Ken 
Van  Wyk.  Research  on  Mitigating  the  Insider  Threat  to  Information  Systems-#2:  Proceedings  of  a 
Workshop  Held  August,  2000.  Santa  Monica,  CA:  RAND  Corporation,  CF-163-DARPA,  2000. 

Anderson,  Robert  H.,  Richard  Brackney,  and  Thomas  Bozek.  Advanced  Network  Defense  Research: 
Proceedings  of  a  Workshop.  Santa  Monica,  CA:  RAND  Corporation,  CF-159-NSA,  2000. 

Commission  for  Review  of  FBI  Security  Programs  (Webster  Commission).  A  Review  of  FBI  Security 
Programs.  U.S.  Department  of  Justice.  March  2002.  Available  at  http://www.fas.org/irp/agency/ 
doj/fbi/websterreport.html. 

Department  of  Defense.  DOD  Insider  Threat  Mitigation :  Final  Report  of  the  Insider  Threat  Process 
Team.  April  24,  2000.  Available  at  http://www.c3i.osd.mil/org/sio/iptreport4_26dbl.doc. 

Doyle,  Jon,  Isaac  Kohane,  William  Long,  Howard  Shrobe,  and  Peter  Szolovits.  “Event  Recognition 
Beyond  Anomaly  and  Signature.”  Proceedings  of  the  2001  IEEE  Workshop  on  Information  Assurance 
and  Security.  United  States  Military  Academy,  West  Point  NY,  5-6  June  2001. 

Heuer,  Richards  J.,  Jr.  Psychology  of  Intelligence  Analysis.  Washington,  D.C.:  CIA  Center  for  the 
Study  of  Intelligence,  1999. 

Lowenthal,  Mark  M.  Intelligence:  From  Secrets  to  Policy ,  2nd  edition.  Washington,  D.C.:  CQ  Press, 
2003. 

Price,  Katherine  E.  Host-Based  Misuse  Detection  and  Conventional  OS  Audit  Data  Collection .  MS 
Thesis,  Purdue  University,  1997. 


113 


